(Dated: November 1, 2006)

We report a new measurement of the $t\bar{t}$ production cross section in $p\bar{p}$ collisions at a
center-of-mass energy of 1.96 TeV using events with one charged lepton (electron or muon), missing transverse
energy, and jets. Using 425 pb$^{-1}$ of data collected with the DØ detector at the Fermilab Tevatron
Collider, and enhancing the $t\bar{t}$ content of the sample by tagging b jets with a secondary vertex
tagging algorithm, the $t\bar{t}$ production cross section is measured to be

$\sigma_{p\bar{p} \to t\bar{t}+X} = 6.6 \pm 0.9$ (stat + syst) $\pm\ 0.4$ (lum) pb.

This cross section is the most precise DØ measurement to date for $t\bar{t}$ production and is in good
agreement with standard model expectations.
PACS numbers: 13.85.Lg, 13.85.Ni, 13.85.Qk, 14.65.Ha 



I. INTRODUCTION 

The top quark was discovered at the Fermilab Tevatron
Collider in 1995 [1, 2] and completes the quark sector
of the three-generation structure of the standard model
(SM). It is the heaviest known elementary particle, with
a mass approximately 40 times larger than that of the
next heaviest quark, the bottom quark. It differs from
the other quarks not only by its much larger mass, but
also by its lifetime, which is too short to build hadronic
bound states. The top quark is one of the least-studied
components of the SM, and the Tevatron, with a center
of mass energy of $\sqrt{s} = 1.96$ TeV, is at present the only
accelerator where it can be produced. The top quark
plays an important role in the discovery of new particles,
as the Higgs boson coupling to the top quark is stronger
than to all other fermions. Understanding the signature
and production rate of top quark pairs is a crucial
ingredient in the discovery of new physics beyond the SM.
In addition, it lays the groundwork for measurements of top
quark properties at DØ.

The top quark is pair-produced in $p\bar{p}$ collisions through
quark-antiquark annihilation and gluon-gluon fusion.
The Feynman diagrams of the leading order (LO)
subprocesses are shown in Fig. 1. At Tevatron energies, the
$q\bar{q} \to t\bar{t}$ process dominates, contributing 85% of the cross
section. The $gg \to t\bar{t}$ process contributes the remaining
15%.




FIG. 1: Leading order Feynman diagrams for the production
of $t\bar{t}$ pairs at the Tevatron.

The total top quark pair production cross section for a
hard scattering process initiated by a $p\bar{p}$ collision at the
center of mass energy $\sqrt{s}$ is a function of the top quark
mass $m_t$ and can be expressed as

$$\sigma_{p\bar{p} \to t\bar{t}+X}(s, m_t) = \sum_{i,j = q,\bar{q},g} \int dx_i\, dx_j\, f_i(x_i, \mu^2)\, \bar{f}_j(x_j, \mu^2)\, \hat{\sigma}_{ij \to t\bar{t}}(\rho, m_t^2, \alpha_s(\mu^2), \mu^2). \qquad (1)$$

The summation indices $i$ and $j$ run over the light
quarks and gluons, $x_i$ and $x_j$ are the momentum
fractions of the partons involved in the $p\bar{p}$ collision, and
$f_i(x_i, \mu^2)$ and $\bar{f}_j(x_j, \mu^2)$ are the parton distribution
functions (PDFs) for the proton and the antiproton,
respectively. $\hat{\sigma}_{ij \to t\bar{t}}(\rho, m_t^2, \alpha_s(\mu^2), \mu^2)$ is the total short
distance cross section at $\hat{s} = x_i \cdot x_j \cdot s$, and is computable as
a perturbative expansion in $\alpha_s$. The renormalization and
factorization scales are chosen to be the same parameter
$\mu$, with dimensions of energy, and $\rho = 4m_t^2/\hat{s}$. The
theoretical uncertainties on the $t\bar{t}$ cross section arise from
the choice of the scale $\mu$, the PDFs, and $\alpha_s$. For the most
recent calculations of the top quark pair production cross
section, the parton-level cross sections include the full
NLO matrix elements [3], and the resummation of leading
(LL) [4] and next-to-leading (NLL) soft logarithms
[5] appearing at all orders of perturbation theory. For a
top quark mass of 175 GeV, the predicted SM $t\bar{t}$
production cross section is $6.7^{+0.7}_{-0.9}$ pb [6]. Deviations of the
measured cross section from the theoretical prediction
could indicate effects beyond QCD perturbation theory.
Explanations might include substantial non-perturbative
effects, new production mechanisms, or additional top
quark decay modes beyond the SM. Previous
measurements [7, 8, 9, 10] show good agreement with the
theoretical expectation.

Within the SM, the top quark decays via the weak
interaction to a W boson and a b quark, with a branching
fraction $\mathrm{Br}(t \to Wb) > 0.998$ [11]. The $t\bar{t}$ pair decay
channels are classified as follows: the dilepton channel,
where both W bosons decay leptonically into an electron
or a muon ($ee$, $\mu\mu$, $e\mu$); the $\ell$+jets channel, where one of
the W bosons decays leptonically and the other
hadronically ($e$+jets, $\mu$+jets); and the all-jets channel, where
both W bosons decay hadronically. A fraction of the $\tau$
leptons decays leptonically to an electron or a muon, and
two neutrinos. These events have the same signature as
events in which the W boson decays directly to an
electron or a muon and are treated as part of the signal in
the $\ell$+jets channel. In addition, dilepton events in which
one of the leptons is not identified are also treated as
part of the signal in the $\ell$+jets channel. Two b quarks
are present in the final state of a $t\bar{t}$ event, which
distinguishes it from most of the background processes. As a
consequence, identifying the bottom flavor of the
corresponding jet can be used as a selection criterion to isolate
the $t\bar{t}$ signal.

This article presents a new measurement [12] of the
$t\bar{t}$ production cross section in the $\ell$+jets channel. The
events contain one charged lepton ($e$ or $\mu$) from a leptonic
W boson decay with high transverse momentum, missing
transverse energy ($\not\!\!E_T$) from the neutrino emitted in
the W boson decay, two b jets from the hadronization
of the b quarks, and two non-b jets ($u$, $d$, $s$, or $c$) from
the hadronic W decay; additional jets are possible due
to initial (ISR) and final state radiation (FSR). The b jets
in the event are identified by explicitly reconstructing
secondary vertices; the addition of the silicon microstrip
tracker to the upgraded detector in Run II made this
technique feasible for the first time at DØ.

This paper is organized as follows: the Run II DØ
detector is described in Section II with special emphasis on
those aspects that are relevant to this analysis. The
trigger and event reconstruction/particle identification
techniques used to select events that contain an electron or
muon and jets are discussed in Secs. III and IV. The
methods used to simulate $t\bar{t}$ and background events are
explained in Sec. V. A data-based method that is used to
estimate the contribution from instrumental and physics
backgrounds to the $\ell$+jets sample is presented in Sec. VI.
The methods used to estimate the efficiency and fake rate
of the b tagging algorithm are explained in Sec. VII. The
means for estimating all contributions to the $\ell$+jets
sample after tagging are detailed in Sec. VIII. Finally, the
description of the method used to extract the cross
section is presented in Sec. IX. The simulation of W boson
events produced in association with jets is detailed in
Appendix A, and the handling of the statistical uncertainty
on the cross section extraction procedure is explained in
Appendix B.



II. THE DØ DETECTOR

The DØ detector [13] is a multi-purpose apparatus
designed to study $p\bar{p}$ collisions at high energies. It consists
of three major subsystems. At the core of the detector,
a magnetized tracking system precisely records the
trajectories of charged particles and measures their
transverse momenta. A hermetic, finely-grained uranium and
liquid argon calorimeter measures the energies of
electromagnetic and hadronic showers. A muon spectrometer
measures the momenta of muons.



A. Coordinate System 

The Cartesian coordinate system used for the DØ
detector is right-handed with the $z$ axis parallel to the
direction of the protons, the $y$ axis vertical, and the $x$ axis
pointing out from the center of the accelerator ring. A
particular reformulation of the polar angle $\theta$ is given by
the pseudorapidity, defined as $\eta = -\ln(\tan(\theta/2))$. In
addition, the momentum vector projected onto a plane
perpendicular to the beam axis (transverse momentum) is
defined as $p_T = p \cdot \sin\theta$. Depending on the choice of the
origin of the coordinate system, the coordinates are
referred to as physics coordinates ($\phi$, $\eta$) when the origin is
the reconstructed vertex of the interaction, or as detector
coordinates ($\phi_{\mathrm{det}}$, $\eta_{\mathrm{det}}$) when the origin is chosen to be
the center of the DØ detector.
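The two definitions above can be evaluated directly. The following is a minimal sketch, assuming angles in radians; the function names are illustrative and are not taken from any DØ software.

```python
import math


def pseudorapidity(theta: float) -> float:
    """Return eta = -ln(tan(theta/2)) for a polar angle 0 < theta < pi."""
    return -math.log(math.tan(theta / 2.0))


def transverse_momentum(p: float, theta: float) -> float:
    """Return p_T = p * sin(theta) for a momentum magnitude p."""
    return p * math.sin(theta)
```

A particle emitted perpendicular to the beam ($\theta = \pi/2$) has $\eta = 0$ and $p_T = p$; $|\eta|$ grows without bound as the direction approaches the beam axis.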



B. Luminosity Monitor 

The Tevatron luminosity at the DØ interaction region
is measured from the rate of inelastic $p\bar{p}$ collisions
observed by the luminosity monitor (LM). The LM consists
of two arrays of twenty-four plastic scintillator counters
with photomultiplier readout. The arrays are located in
front of the forward calorimeters at $z = \pm 140$ cm and
occupy the region between the beam pipe and the
forward preshower detector. The counters are 15 cm long
and cover the pseudorapidity range $2.7 < |\eta_{\mathrm{det}}| < 4.4$.
The uncertainty on the luminosity is currently estimated
to be 6.1% [14].



C. The Central Tracking System 

The purpose of the central tracking system [15] is to
measure the momenta, directions, and signs of the
electric charges for charged particles produced in a collision.
The silicon microstrip tracker (SMT) is located closest to
the beam pipe and allows for an accurate determination
of impact parameters and identification of secondary
vertices. The length of the interaction region ($\sigma \approx 25$ cm)
led to the design of barrel modules interspersed with
disks, and assemblies of disks in the forward and
backward regions. The barrel detectors measure primarily
the $r$-$\phi$ coordinate, and the disk detectors measure $r$-$z$
as well as $r$-$\phi$. The detector has six barrels in the central
region; each barrel has four silicon readout layers, each
composed of two staggered and overlapping sub-layers.
Each barrel is capped at high $|z|$ with a disk of twelve
double-sided wedge detectors, called an F-disk. In the far
forward and backward regions, a unit consisting of three
F-disks and two large-diameter H-disks provides tracking
at high $|\eta_{\mathrm{det}}| < 3.0$. Ionized charge is collected by p- or
n-type silicon strips of pitch between 50 and 150 $\mu$m that
are used to measure the position of the hits. The axial
hit resolution is of the order of 10 $\mu$m; the $z$ hit resolution
is 35 $\mu$m for 90° stereo and 450 $\mu$m for 2° stereo detector
modules.

Surrounding the SMT is the central fiber tracker
(CFT), which consists of 835 $\mu$m diameter scintillating
fibers mounted on eight concentric support cylinders and
occupies the radial space from 20 to 52 cm from the
center of the beam pipe. The two innermost cylinders
are 1.66 m long, and the outer six cylinders are 2.52 m
long. Each cylinder supports one doublet layer of fibers
oriented along the beam direction and a second doublet
layer at a stereo angle of alternating +3° and −3°. In
each doublet the two layers of fibers are offset by half a
fiber width to provide improved coverage. The CFT has
a cluster resolution of about 100 $\mu$m per doublet layer.






The momenta of charged particles are determined from 
their curvature in the 2 T magnetic field provided by a 
2.7 m long superconducting solenoid magnet [16]. The
superconducting solenoid, a two layer coil with mean ra- 
dius 60 cm, has a stored energy of 5 MJ and operates 
at 10 K. Inside the tracking volume, the magnetic field 
along the trajectory of any particle reaching the solenoid 
is uniform within 0.5%. The uniformity is achieved in the 
absence of a field-shaping iron return yoke by using two 
grades of conductor. The superconducting solenoid coil 
plus cryostat wall has a thickness of about 0.9 radiation 
lengths in the central region of the detector. 

Hits from both tracking detectors are combined to
reconstruct tracks. The measured momentum
resolution of the tracker can be parameterized as

$\sigma(1/p_T)/(1/p_T) = \sqrt{(0.003\,p_T)^2 + [\ldots]}$,

with the first term accounting for
the measurement uncertainty of the individual hits in
the tracker, and the second term for the multiple
scattering. In the expression above, $p_T$ is the particle's
transverse momentum (in GeV), and $L$ is the normalized
track bending lever arm. $L$ is equal to 1 for tracks with
$|\eta| < 1.62$ and equal to [...] otherwise. $\theta'$ represents the
angle at which the track exits the tracker.



D. The Calorimeter System 

The uranium/liquid-argon sampling calorimeters con- 
stitute the primary system used to identify electrons, 
photons, and jets. The system is subdivided into the 
central calorimeter (CC) covering roughly $|\eta_{\mathrm{det}}| < 1$
and two end calorimeters (EC) extending the coverage
to $|\eta_{\mathrm{det}}| \approx 4$. Each calorimeter contains an
electromagnetic (EM) section closest to the interaction region,
followed by fine and coarse hadronic sections with modules
that increase in size with the distance from the inter- 
action region. Each of the three calorimeters is located 
within a cryostat that maintains the temperature at ap- 
proximately 80 K. The EM sections use thin 3 or 4 mm 
plates made from nearly pure depleted uranium. The fine 
hadronic sections are made from 6 mm thick uranium- 
niobium alloy. The coarse hadronic modules contain rela- 
tively thick 46.5 mm plates of copper in the CC and stain- 
less steel in the EC. The intercryostat region, between the 
CC and the EC calorimeters, contains additional layers 
of sampling, the scintillator-based intercryostat detector, 
to improve the energy resolution. The CC and EC con- 
tain approximately seven and nine interaction lengths of 
material respectively, ensuring containment of nearly all 
particles except high $p_T$ muons and neutrinos.

The preshower detectors are designed to improve the
identification of electrons and photons and to correct for
their energy losses in the solenoid during offline event
reconstruction. The central preshower detector (CPS) is
located in the 5 cm gap between the solenoid and the
CC, covering the region $|\eta_{\mathrm{det}}| < 1.3$. The two forward
preshower detectors (FPSs) are attached to the faces of
the ECs and cover the region $1.5 < |\eta_{\mathrm{det}}| < 2.5$. The
relative momentum resolution for the calorimeter system
is measured in data and found to be $\sigma(p_T)/p_T \approx 13\%$ for
50 GeV jets in the CC and $\sigma(p_T)/p_T \approx 12\%$ for 50 GeV
jets in the ECs. The energy resolution for electrons in
the CC is $\sigma(E)/E \approx 15\%/\sqrt{E} \oplus 4\%$.
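The symbol $\oplus$ in the electron resolution denotes a sum in quadrature of a sampling term and a constant term. As an illustration only, the sketch below evaluates that expression with the quoted coefficients, assuming $E$ is given in GeV; the function name and argument names are hypothetical.

```python
import math


def em_relative_resolution(energy_gev: float,
                           sampling: float = 0.15,
                           constant: float = 0.04) -> float:
    """Return sigma(E)/E = sampling/sqrt(E) (+) constant, added in quadrature.

    The default coefficients are the 15% and 4% quoted in the text;
    energy_gev is assumed to be in GeV.
    """
    return math.hypot(sampling / math.sqrt(energy_gev), constant)
```

With these inputs the sampling term falls as $1/\sqrt{E}$, so the constant term dominates at high energy: at 100 GeV the sampling contribution is 1.5% against a constant 4%.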

E. The Muon System 

The muon system [17] is the outermost part of the
DØ detector. It surrounds the calorimeters and serves
to identify and trigger on muons and to provide crude
measurements of momentum and charge. It consists of a
system of proportional drift tubes (PDTs) that cover the
region of $|\eta_{\mathrm{det}}| < 1.0$ and mini drift tubes (MDTs) that
extend coverage to $|\eta_{\mathrm{det}}| \approx 2.0$. Scintillation counters are
used for triggering and for cosmic and beam-halo muon
rejection. Toroidal magnets and special shielding
complete the muon system. Each subsystem has three layers,
with the innermost layer located between the calorimeter
and the iron of the toroid magnet. The two remaining
layers are located outside the iron. In the region directly
below the CC, only partial coverage by muon detectors
is possible to accommodate the support structure for the
detector and the readout electronics. The average energy
loss of a muon is 1.6 GeV in the calorimeter and 1.7 GeV
in the iron; the momentum measurement is corrected for
this energy loss. The average momentum resolution for
tracks that are matched to the muon and include
information from the SMT and the CFT is measured to be
$\sigma(p_T) = 0.02 \oplus 0.002\,p_T$ (with $p_T$ in GeV).

III. TRIGGERS 

The trigger system is a three-tiered pipelined system. 
The first stage (Level 1) is a hardware trigger that con- 
sists of a framework built of field programmable gate 
arrays (FPGAs) which take inputs from the luminosity 
monitor, calorimeter, central fiber tracker, and muon
system. It makes a decision within 4.2 $\mu$s and results in a
trigger accept rate of about 2 kHz. In the second stage 
(Level 2), hardware processors associated with specific 
subdetectors process information that is then used by a 
global processor to determine correlations among differ- 
ent detectors. Level 2 has an accept rate of 1 kHz at 
a maximum dead-time of 5% and a maximum latency 
of 100 $\mu$s. The third stage (Level 3) uses a computing
farm to perform a limited reconstruction of the event and 
make a trigger decision using the full event information, 
further reducing the rate for data recorded to tape to 
50 Hz. Throughout this analysis, the data sample was 
selected at the trigger level by requiring the presence of 
a lepton and a jet; however, the required quality criteria 
and thresholds differ between running periods, shown in 
chronological order in Table I.

Trigger name | $\int \mathcal{L}\,dt$ (pb$^{-1}$) | Level 1 | Level 2 | Level 3

$e$+jets channel
EM15_2JT15 | 127.8 | 1 EM tower, $E_T > 10$ GeV; 2 jet towers, $p_T > 5$ GeV | 1 e, $E_T > 10$ GeV, EM fraction > 0.85; 2 jets, $E_T > 10$ GeV | 1 tight e, $E_T > 15$ GeV; 2 jets, $p_T > 15$ GeV
E1_SHT15_2J20 | 244.0 | 1 EM tower, $E_T > 11$ GeV | None | 1 tight e, $E_T > 15$ GeV; 2 jets, $p_T > 20$ GeV
E1_SHT15_2J_J25 | 53.7 | 1 EM tower, $E_T > 11$ GeV | 1 EM cluster, $E_T > 15$ GeV | 1 tight e, $E_T > 15$ GeV; 2 jets, $p_T > 20$ GeV; 1 jet, $p_T > 25$ GeV

$\mu$+jets channel
MU_JT20_L2M0 | 131.5 | 1 $\mu$, $|\eta_{\mathrm{det}}| < 2.0$; 1 jet tower, $p_T > 5$ GeV | 1 $\mu$, $|\eta_{\mathrm{det}}| < 2.0$ | 1 jet, $p_T > 20$ GeV
MU_JT25_L2M0 | 244.0 | 1 $\mu$, $|\eta_{\mathrm{det}}| < 2.0$; 1 jet tower, $p_T > 3$ GeV | 1 $\mu$, $|\eta_{\mathrm{det}}| < 2.0$; 1 jet, $p_T > 10$ GeV | 1 jet, $p_T > 25$ GeV
MUJ2_JT25 | 46.2 | 1 $\mu$, $|\eta_{\mathrm{det}}| < 2.0$; 1 jet tower, $p_T > 5$ GeV | 1 $\mu$, $|\eta_{\mathrm{det}}| < 2.0$; 1 jet, $p_T > 8$ GeV | 1 jet, $p_T > 25$ GeV

TABLE I: Summary of the trigger definitions used for data collection. The trigger names indicate the different running periods
that correspond to the same trigger conditions. The integrated luminosity corresponding to each running period is shown in
the second column.

Samples of events recorded with unbiased triggers are
used to measure the probability of a single object
satisfying a particular trigger requirement. Offline
reconstructed objects are then identified in the events, and
the efficiency is given by the fraction of these objects
that satisfy the trigger condition under study. Single
object efficiencies are in general parameterized as functions
of the kinematic variables $p_T$, $\eta$, and $\phi$ of the offline
reconstructed objects. The total probability for an event to
satisfy a set of trigger requirements is obtained assuming
that the probability for a single object to satisfy a
specific trigger condition is independent of the presence of
other objects in the event.
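The independence assumption is what allows per-object efficiencies to be combined into an event probability. As a hedged illustration, consider a hypothetical trigger term that fires if at least one object in the event satisfies it; the term, the function name, and the numerical inputs below are assumptions made for this sketch and are not taken from the analysis.

```python
def prob_at_least_one(object_efficiencies):
    """Probability that at least one object fires a trigger term.

    Assumes each object satisfies the term independently of the others,
    so the probability that none fires is the product of (1 - eff_i).
    """
    prob_none = 1.0
    for eff in object_efficiencies:
        prob_none *= 1.0 - eff
    return 1.0 - prob_none
```

For two jets that each satisfy the term with 50% probability, the event-level probability under this assumption is $1 - 0.5 \times 0.5 = 0.75$.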

The efficiency for a $t\bar{t}$ event to satisfy a particular
trigger condition is measured by folding into Monte Carlo
(MC) simulated events the per-electron, per-muon, and
per-jet efficiencies for individual trigger conditions at
Level 1, Level 2, and Level 3. The total event
probability $P(L1, L2, L3)$ is then calculated as the product of
the probabilities for the event to satisfy the trigger
conditions at each triggering level:

$P(L1, L2, L3) = P(L1) \cdot P(L2|L1) \cdot P(L3|L1, L2)$,

where $P(L2|L1)$ and $P(L3|L1, L2)$ represent the
conditional probabilities for an event to satisfy a set of criteria
given it has already passed the offline selection and the
requirements imposed at the previous triggering level(s).

The overall trigger efficiency for $t\bar{t}$ events
corresponding to the data samples used in this analysis is calculated
as the luminosity-weighted average of the event
probability associated with the trigger requirements
corresponding to each running period. The systematic uncertainty
on the trigger efficiency is obtained by varying the trigger
efficiency parameterizations by $\pm 1\sigma$.
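The two combination steps described above, the product over trigger levels and the luminosity-weighted average over running periods, can be sketched as follows. The integrated luminosities in the usage note are those listed for the $e$+jets triggers in Table I; the per-period probabilities are placeholders chosen for illustration, not measured values, and the function names are hypothetical.

```python
def event_trigger_probability(p_l1, p_l2_given_l1, p_l3_given_l1_l2):
    """P(L1, L2, L3) = P(L1) * P(L2|L1) * P(L3|L1, L2)."""
    return p_l1 * p_l2_given_l1 * p_l3_given_l1_l2


def luminosity_weighted_average(probabilities, luminosities):
    """Average per-period event probabilities, weighted by luminosity."""
    if len(probabilities) != len(luminosities):
        raise ValueError("one probability is required per running period")
    total = sum(luminosities)
    return sum(p * lum for p, lum in zip(probabilities, luminosities)) / total
```

For example, `luminosity_weighted_average([0.90, 0.92, 0.91], [127.8, 244.0, 53.7])` weights three placeholder probabilities by the $e$+jets luminosities, so the second running period contributes more than half of the average.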



IV. EVENT RECONSTRUCTION AND 
SELECTION 

A collection of software algorithms performs the offline 
reconstruction of each event, identifying physics objects 
(tracks, primary and secondary vertices, electrons,
photons, muons, jets and their flavor, and $\not\!\!E_T$) and
determining their kinematic properties. Various data samples are
then selected based on the objects present in the event. 
The following sections describe the offline event recon- 
struction and sample selection used for this analysis. 



A. Tracks and Primary Vertex 

Charged particles leave hits in the central tracking
system from which tracks are reconstructed. The track
reconstruction and primary vertex identification are done
in several steps: adjacent SMT or CFT channels above a
certain threshold are grouped into clusters; sets of
clusters which lie along the path of a particle are identified;
a road-based algorithm is used for track finding, followed
by a Kalman filter [18] algorithm for track fitting. The
vertex search procedure [19] consists of three steps: track
clustering, track selection, and vertex finding and fitting.
First, tracks are clustered along the $z$ coordinate, starting
from the track with the highest $p_T$ and adding tracks to
the $z$-cluster if the distance between the position along
$z$ of the point of closest approach of the track to the
$z$-cluster and the average $z$-cluster position is less than
2 cm. The value of this cut is optimized to effectively
cluster tracks belonging to the same interaction, while
being able to resolve multiple interactions. Next,
quality cuts are applied to the reconstructed tracks in every
$z$-cluster requiring that they have at least 2 SMT hits,
$p_T > 0.5$ GeV, and that they are within three standard
deviations of the nominal transverse interaction position.
Finally, for every $z$-cluster, a tear-down vertex search
algorithm fits all selected tracks to a common vertex,
excluding individual tracks from the fit until the total
vertex $\chi^2$ per degree of freedom is less than ten. The result
of the fit is a list of reconstructed vertices that contains
the hard scatter primary vertex (PV) and any additional
vertices produced in minimum bias interactions. The PV
is identified from this list based on the $p_T$ spectrum of the
particles associated with each interaction. The $\log_{10} p_T$
distribution of tracks from minimum bias processes is
used to define a probability for a track to come from a
minimum bias vertex. The probability for a vertex to
originate from a minimum bias interaction is obtained
from the probabilities for each track and is independent
of the number of tracks used in the calculation. The
vertex with the lowest minimum bias probability is chosen
as the PV.
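The first step of the vertex search, clustering tracks along $z$, can be sketched as a greedy loop. This is a simplified illustration under stated assumptions: each track is reduced to a hypothetical `(pt, z)` pair, where `z` stands for the position along the beam of its point of closest approach, and leftover tracks seed new clusters in decreasing $p_T$ order. It is not the DØ implementation.

```python
def cluster_tracks_in_z(tracks, max_dz_cm=2.0):
    """Group (pt, z) tracks into z-clusters.

    Tracks are visited in decreasing pt. A track joins the current
    cluster if its z lies within max_dz_cm of the running average z of
    that cluster; tracks left over seed the next cluster.
    """
    remaining = sorted(tracks, key=lambda t: t[0], reverse=True)
    clusters = []
    while remaining:
        cluster = [remaining[0]]  # highest-pt unassigned track is the seed
        leftover = []
        for track in remaining[1:]:
            mean_z = sum(t[1] for t in cluster) / len(cluster)
            if abs(track[1] - mean_z) < max_dz_cm:
                cluster.append(track)
            else:
                leftover.append(track)
        clusters.append(cluster)
        remaining = leftover
    return clusters
```

With the 2 cm window from the text, tracks from two interactions separated by about 10 cm along the beam end up in separate clusters, which is the behavior the cut is tuned for.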

To ensure a high reconstruction quality for the PV,
the following additional requirements have to be
satisfied: the position along $z$ of the PV ($PV_z$) has to be
within 60 cm of the center of the detector and at least
three tracks have to be fitted to form the PV. The
efficiency of the PV reconstruction is about 100% in the
central $|z|$ region, but drops quickly outside the SMT
fiducial volume ($|z| < 36$ cm for the barrel) due to the
requirement of two SMT hits per track forming the PV.
The two tracking detectors locate the PV with a
resolution of about 35 $\mu$m along the beamline [13].



B. Electrons 

Electrons are reconstructed Q using information from 
the calorimeter and the central tracker. A simple cone
algorithm of radius $\Delta R = 0.2$, where
$\Delta R = \sqrt{(\Delta\phi)^2 + (\Delta\eta)^2}$, clusters calorimeter cells around seeds with
$E_T > 1.5$ GeV.
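The cone distance $\Delta R$ is used throughout the object definitions that follow. A minimal sketch of its evaluation is given below; wrapping $\Delta\phi$ into $(-\pi, \pi]$ is standard practice but is stated here as an assumption, since the text does not spell it out, and the function names are illustrative.

```python
import math


def delta_phi(phi1: float, phi2: float) -> float:
    """Azimuthal difference wrapped into the interval (-pi, pi]."""
    dphi = (phi1 - phi2) % (2.0 * math.pi)
    return dphi - 2.0 * math.pi if dphi > math.pi else dphi


def delta_r(eta1: float, phi1: float, eta2: float, phi2: float) -> float:
    """Delta R = sqrt(delta_phi^2 + delta_eta^2)."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))
```

The wrap matters for objects on either side of $\phi = 0$: two directions at $\phi = 0.1$ and $\phi = 2\pi - 0.1$ are separated by 0.2 in azimuth, not by $2\pi - 0.2$.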

An extra-loose electron is defined as an EM cluster 
that is almost entirely contained within the EM layers of 
the calorimeter, is isolated from hadronic energy deposi- 
tions, and has longitudinal and transverse shapes consis- 
tent with the expectations from simulated electrons. An 
extra-loose electron that has been spatially matched to 
a central track is called a loose electron. A loose
electron is considered tight if it passes a seven-variable likelihood
test designed to distinguish between electrons and
background. The likelihood takes into account both tracking
and calorimeter information, and provides more power- 
ful discrimination than individual cuts on the same vari- 
ables. 



C. Muons 

Muons are reconstructed using information from the
muon detector and the central tracker. Local muon
tracks are required to have hits in all three layers of the
muon system, be consistent with production in the
primary collision based on timing information from
associated scintillator hits, and be located within $|\eta_{\mathrm{det}}| < 2.0$.
Tracks are then extended to the point of closest approach
to the beamline, and a global fit is performed
considering all central tracks within one radian in azimuthal and
polar angles. The central track with the highest $\chi^2$
probability is assigned to the muon candidate. The muon $p_T$,
$\eta$, and $\phi$ are taken from the matching central track.

To reject muons from semileptonic heavy flavor decays, 
the distance of closest approach of the muon track to the 
PV is required to be $< 3\sigma$; in addition, the muon is
required to be isolated. Two different isolation criteria 
are used in this analysis Q: the loose muon isolation 
criterion requires that the muon be separated from jets,
$\Delta R(\mu, \mathrm{jet}) > 0.5$. The tight muon isolation criterion
requires, in addition, that the muon not be surrounded
by activity in either the calorimeter or the tracker.



D. Jets 

Jets are reconstructed in the calorimeter using the
improved legacy cone algorithm [20] with radius 0.5 and
a seed threshold of 0.5 GeV. A cell-selection algorithm
keeps cells with energies at least $4\sigma$ above the average
electronic noise and any adjacent cell with energy at
least $2\sigma$ above the average electronic noise (T42
algorithm). Reconstructed jets are required to be confirmed
by the independent trigger readout, have a minimum $p_T$
of 8 GeV, and be separated from extra-loose electrons by
$\Delta R(\mathrm{jet}, e) > 0.5$.
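The T42 cell selection can be illustrated on a toy one-dimensional row of cells. The sketch assumes a single noise width `sigma` for all cells, energies already measured relative to the average noise, and adjacency defined as the immediate left and right neighbors; the real calorimeter geometry and neighbor definition are not described in the text.

```python
def t42_select(energies, sigma):
    """Return a keep/drop flag per cell for a 1D row of cell energies.

    A cell is kept if its energy is at least 4 sigma, or if it is at
    least 2 sigma and sits next to a cell that is at least 4 sigma.
    """
    n = len(energies)
    is_seed = [e >= 4.0 * sigma for e in energies]
    keep = []
    for i, energy in enumerate(energies):
        next_to_seed = (i > 0 and is_seed[i - 1]) or (i + 1 < n and is_seed[i + 1])
        keep.append(is_seed[i] or (energy >= 2.0 * sigma and next_to_seed))
    return keep
```

In this toy, an isolated cell at $2.5\sigma$ is dropped while an identical cell beside a $5\sigma$ cell is kept, which is how the selection suppresses noise without trimming the edges of real showers.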

The $p_T$ of each reconstructed jet is corrected for
calorimeter showering effects, overlaps due to multiple
interactions and event pileup, calorimeter noise, and the
energy response of the calorimeter. The calorimeter
response is measured from the $p_T$ imbalance in photon +
jet events. Jets containing a muon ($\Delta R(\mu, \mathrm{jet}) < 0.5$) are
considered to originate from a semileptonic b quark decay
and are corrected for the momentum carried by the muon
and the neutrino. For this correction, it is assumed that
the neutrino carries the same momentum as the muon.
The relative uncertainty on the jet energy calibration is
$\approx 7\%$ for jets with $20 < p_T < 250$ GeV.

E. Missing $E_T$

The presence of a neutrino in an event is inferred from
the imbalance of the energy in the transverse plane. This
imbalance is reconstructed from the vector sum of the
transverse energies of the cells selected by the T42
algorithm; cells of the coarse hadronic calorimeter are only
included if they are clustered within jets. The vector
opposite to this total visible energy vector is denoted the
missing energy vector and its modulus is the raw
missing transverse energy ($\not\!\!E_T^{\mathrm{raw}}$). The calorimeter missing
transverse energy ($\not\!\!E_T^{\mathrm{cal}}$) is then obtained after
subtracting the electromagnetic and jet response corrections
applied to reconstructed objects in the event. Finally, the
transverse momenta of all muons present in the event are
subtracted (after correcting for the expected energy
deposition of the muon in the calorimeter) to obtain the $\not\!\!E_T$
of the event.
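The raw quantity defined above is the modulus of the vector opposite to the summed transverse energy of the selected cells. A minimal sketch, assuming each cell is summarized by a hypothetical `(et, phi)` pair and omitting the response and muon corrections described in the text:

```python
import math


def raw_missing_et(cells):
    """Return (magnitude, phi) of the raw missing transverse energy.

    cells is an iterable of (et, phi) pairs. The visible transverse
    energy vector is summed component-wise and then reversed.
    """
    sum_x = 0.0
    sum_y = 0.0
    for et, phi in cells:
        sum_x += et * math.cos(phi)
        sum_y += et * math.sin(phi)
    miss_x, miss_y = -sum_x, -sum_y
    return math.hypot(miss_x, miss_y), math.atan2(miss_y, miss_x)
```

Two back-to-back deposits of 30 GeV and 10 GeV leave a 20 GeV imbalance, and the missing energy vector points along the direction of the smaller deposit.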



F. b Jets 

The secondary vertex tagging algorithm (SVT)
identifies jets arising from bottom quark hadronization (b jets)
by explicitly reconstructing the decay vertex of long-lived
b-flavored hadrons within the jet. The algorithm is tuned
to identify b jets with high efficiency, referred to as the 
b tagging efficiency, while keeping low the probability of 
tagging a light jet (from a u, d, or s quark or a gluon), 
referred to as the mistag rate. The efficiency to tag a 
jet arising from charm quark hadronization (c jets) is re- 
ferred to as the c tagging efficiency. The algorithm pro- 
ceeds in three main steps: identification of the PV, recon- 
struction of displaced secondary vertices (SVs), and the 
association of SVs with calorimeter jets. The first step is 
described in Sec. IV A; the last two steps are described
below. 

On average, two-thirds of the particles within a jet
are electrically charged and are therefore detected as
tracks in the central tracking system. For each track,
the distance of closest approach between the track and
the beamline is referred to as $dca$. The $z$-position of the
projection of the $dca$ on the beamline is referred to as
$zdca$. An algorithm has been developed [19] to cluster
tracks into so-called track-jets. Following the procedure
described in Sec. IV A, tracks are grouped according to
their $zdca$ with respect to $z = 0$. Looping in decreasing
order of track $p_T$, tracks are added to this pre-cluster if
the difference between the track $zdca$ and the pre-cluster
$z$ position is less than 2 cm. Next, each pre-cluster is
associated with the vertex with the highest track
multiplicity within 2 cm of the center of the pre-cluster,
and tracks satisfying the following criteria are selected:
$p_T > 0.5$ GeV, > 1 hits in the SMT barrels or F-disks,
$|dca| < 0.2$ cm, and $|zdca| < 0.4$ cm, where $dca$ and $zdca$
are calculated with respect to the reconstructed vertex
associated with the pre-cluster. Finally, for each
pre-cluster, a track-jet is formed by clustering the selected
tracks with a simple cone algorithm of radius $\Delta R = 0.5$ in
($\eta$, $\phi$) space. The procedure adds individual tracks to the
jet cone in decreasing order of track $p_T$, and re-computes
the jet variables by adding the track 4-momentum. The
process is repeated until no more seed tracks are left.

The secondary vertex finder is applied to every
track-jet in the event with at least two tracks. As a first step,
the algorithm loops over all tracks selecting only those
with $dca$ significance $|dca/\sigma(dca)| > 3.5$. Next, the
algorithm uses a build-up method that finds two-track seed
vertices by fitting all combinations of pairs of selected
tracks within a track-jet. Additional tracks pointing to
the seeds are attached to the vertex if they improve the
resulting vertex $\chi^2$/dof. The process is repeated until no
additional tracks can be associated with seeds. This
procedure results in vertices that might share tracks. The
vertices found are required to satisfy the following set
of conditions: track multiplicity $\geq 2$, vertex transverse
decay length $|L_{xy}| = |\vec{r}_{SV} - \vec{r}_{PV}| < 2.6$ cm, vertex
transverse decay length significance $|L_{xy}/\sigma(L_{xy})| > 7.0$,
$\chi^2_{\mathrm{vertex}}$/degrees of freedom $< 10$, and $|\mathrm{colinearity}| > 0.9$.
The colinearity is defined as $\vec{L}_{xy} \cdot \vec{p}_T^{\,vtx} / (|\vec{L}_{xy}|\,|\vec{p}_T^{\,vtx}|)$,
where $\vec{p}_T^{\,vtx}$ is computed as the vector sum of the
momenta of all attached tracks after the constrained fit to
the secondary vertex. The sign of the transverse decay
length is given by the sign of the colinearity. Secondary
vertices composed of two tracks with opposite sign are
required to be inconsistent with a $V^0$ hypothesis. The
hypotheses tested by the algorithm include $K^0_S \to \pi^+\pi^-$,
$\Lambda \to p^+\pi^-$, and photon conversions ($\gamma \to e^+e^-$).
Secondary vertices are rejected if the invariant di-track mass
is consistent with the tested $V^0$ mass in a mass window
defined by $\pm 3\sigma$ of the measured $V^0$ mass resolution.
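The vertex-level quantities above reduce to two-dimensional vector arithmetic in the transverse plane. The sketch below computes the colinearity and the signed decay length and applies the quoted selection; positions are assumed to be in cm, vectors are hypothetical `(x, y)` tuples, the multiplicity requirement is read as at least two tracks, and the $V^0$ mass-window veto is omitted.

```python
import math


def colinearity(sv, pv, pt_vtx):
    """Cosine of the angle between L_xy = sv - pv and the vertex p_T."""
    lx, ly = sv[0] - pv[0], sv[1] - pv[1]
    norm = math.hypot(lx, ly) * math.hypot(pt_vtx[0], pt_vtx[1])
    return (lx * pt_vtx[0] + ly * pt_vtx[1]) / norm


def signed_decay_length(sv, pv, pt_vtx):
    """|L_xy| carrying the sign of the colinearity."""
    length = math.hypot(sv[0] - pv[0], sv[1] - pv[1])
    return math.copysign(length, colinearity(sv, pv, pt_vtx))


def passes_vertex_cuts(n_tracks, lxy, sigma_lxy, chi2_per_dof, colin):
    """Apply the vertex requirements quoted in the text (V0 veto omitted)."""
    return (n_tracks >= 2
            and abs(lxy) < 2.6
            and abs(lxy / sigma_lxy) > 7.0
            and chi2_per_dof < 10.0
            and abs(colin) > 0.9)
```

A vertex displaced along the summed track momentum has colinearity near +1 and a positive decay length; one displaced against it has a negative decay length.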

In the final step, a calorimeter jet is identified as a b jet 
(also called tagged) if it contains a reconstructed SV with 
$L_{xy}/\sigma(L_{xy}) > 7.0$ within $\Delta R < 0.5$. Events containing
one or more tagged jets are referred to as tagged events. 

G. Data Samples 

The result presented in this document is based on data 
recorded using the DØ detector between August 2002 and
March 2004. Several data samples are used at various 
stages of the analysis and are defined below. 

The μ+jets preselected sample is based on 422 pb^-1
of data and consists of events containing one tight muon
with p_T > 20 GeV and |η_det| < 2.0 that is matched to
a trigger muon, missing E_T > 20 GeV separated in φ from the
muon direction, and at least one jet with p_T > 20 GeV
and |η_det| < 2.5.

The e+jets preselected sample is based on 425 pb^-1 of
data and consists of events containing one tight electron
with p_T > 20 GeV and |η_det| < 1.1 that is matched to a
trigger electron, missing E_T > 20 GeV separated in φ from the
electron direction, and at least one jet with p_T > 20 GeV
and |η_det| < 2.5.

For both the μ+jets and the e+jets preselected samples,
events containing a second high-p_T isolated lepton
are rejected to ensure orthogonality with the dilepton 
analysis [Icl |. In addition, the samples are divided into 
four subsamples based on their jet multiplicity: 1, 2,
3, and 4 or more jets. In each case, the leading jet is
required to have p_T > 40 GeV.

The preselection efficiency is measured in MC tt̄ samples
that properly take into account tau leptons that
subsequently decay leptonically to an electron or a muon.
The efficiency measured in MC is corrected by data-to-MC
scale factors derived from control samples where the
respective efficiency can be measured in both data and
MC. The quoted efficiencies include the trigger
efficiency for events that pass the preselection, measured
by folding into the MC the per-lepton and per-jet trigger
efficiencies measured in data, as described in Sec. III.
The resulting values for the preselection efficiency for the
processes tt̄ → ℓ+jets and tt̄ → ℓℓ are summarized in
Table II.

Systematic uncertainties in the preselection efficiencies 
arise from the variation of the trigger efficiencies, the 
data-to-MC scale factors, the jet energy scale and resolu- 
tion, and the jet reconstruction/identification efficiency. 

In addition to the signal samples, the following samples
are selected for various studies. The muon-in-jet sample
contains two reconstructed jets and a non-isolated muon
with ΔR(μ, jet) < 0.5. The muon-in-jet-away-jet-tagged
sample is a subset of the muon-in-jet sample, where the
jet opposite to the one containing the muon is tagged by
SVT. The EMqcd sample contains an extra-loose electron
with p_T > 20 GeV, at least one reconstructed jet, and
missing E_T < 10 GeV. The loose-minus-tight sample consists of
events that pass the e+jets preselection, except that the
electron passes the loose but fails the tight selection.



V. EVENT SIMULATION 

Signal and background samples are produced using
the MC event simulation methods described below. In
each case, generated events are processed through the
GEANT3-based [21] DØ detector simulation and
reconstructed with the same program used for collider data.
Small additional corrections are applied to all recon- 
structed objects to improve the agreement between col- 
lider data and simulation. In particular, the momentum 
scales and resolutions for electrons and muons in the MC 
were tuned to reproduce the corresponding leptonic Z 
boson invariant mass distribution observed in data, and 
MC jets were smeared in energy according to a random 
Gaussian distribution to match the resolutions observed 
in data for the different regions of the detector. Over- 
all, good agreement is observed between reconstructed 
objects in data and MC. 

For all MC samples, the jet flavor (b, c, or light) is
determined by matching the direction of the reconstructed
jet to the hadron flavor within the cone ΔR < 0.5 in
(η, φ) space. If there is more than one hadron found
within the cone, the jet is considered to be a b jet if the
cone contains at least one b-flavored hadron. It is called
a c jet if there is at least one c-flavored hadron in the
cone and no b-flavored hadron. Light jets are required to
have no b- or c-flavored hadrons within ΔR < 0.5.
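
The flavor assignment rule can be written compactly, as in the following sketch. It assumes a hypothetical list of generator-level hadrons given as `(eta, phi, flavor)` tuples with `flavor` equal to `"b"`, `"c"`, or `"light"`; that representation is an assumption made for illustration.

```python
import math


def delta_r(eta1, phi1, eta2, phi2):
    """Distance in (eta, phi) space, with the phi difference wrapped to [-pi, pi]."""
    dphi = math.remainder(phi1 - phi2, 2.0 * math.pi)
    return math.hypot(eta1 - eta2, dphi)


def jet_flavor(jet_eta, jet_phi, hadrons, cone=0.5):
    """Return 'b', 'c', or 'light' with b taking precedence over c."""
    flavors_in_cone = {
        flavor
        for eta, phi, flavor in hadrons
        if delta_r(jet_eta, jet_phi, eta, phi) < cone
    }
    if "b" in flavors_in_cone:
        return "b"
    if "c" in flavors_in_cone:
        return "c"
    return "light"
```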

Production and decay of the tt̄ signal are simulated using
ALPGEN 1.3 [22], which includes the complete 2 → n
partons (2 ≤ n ≤ 6) Born-level matrix elements, followed
by pythia 6.2 [23] to simulate the underlying event and
the hadronization. The top quark mass is set to 175 GeV.
EVTGEN [24] is used to provide the various branching
fractions and lifetimes for heavy-flavor states. The
factorization and renormalization scales for the calculation
of the tt̄ process are set to Q = m_t. MC samples are
generated separately for the dilepton and ℓ+jets
signatures, according to the decay of the W bosons. Leptons
include electrons, muons, and taus, with taus decaying
inclusively using tauola [25].

The W+jets background is simulated using the
same MC programs; the factorization and renormalization
scales are set to Q² = M_W² + Σ (p_T^j)². The events
are subdivided into four disjoint samples with 1, 2, 3,
and 4 or more jets in the final state. Details on the
generation of these samples can be found in Appendix A.

Additional samples are generated for single top quark
production (using CompHEP [26] followed by pythia),
diboson production (using ALPGEN followed by pythia),
and Z/γ* → ττ production (using pythia). Since
the cross sections provided by ALPGEN correspond to LO
calculations, correction factors are applied to scale them
up to the NLO cross sections [27]. Table III summarizes
the generated processes with the corresponding cross
sections and NLO correction factors where applicable. For
Z/γ* → ττ, the cross section is quoted at NNLO and
corresponds to the mass range 60 < M_Z < 130 GeV.



VI. COMPOSITION OF THE PRESELECTED 
SAMPLES 

The preselected samples are dominated by events
containing a high-p_T isolated lepton originating from the
decay of a W boson accompanied by jets. These events
are referred to as W-like events. The samples also
include contributions from QCD multijet events in which
a jet is misidentified as an electron (e+jets channel), or
in which a muon originating from the semileptonic decay
of a heavy quark appears isolated (μ+jets channel). In
addition, substantial missing E_T can arise from fluctuations and
mismeasurements of the jet energies. These instrumental
backgrounds are referred to as the QCD multijet back- 
ground, and their contribution is directly estimated from 
data, following the matrix method. 

The matrix method relies on two data sets: a tight
sample that consists of N_t events that pass the
preselection, and a loose sample that consists of N_ℓ events that
pass the preselection but have the tight lepton
requirement removed, i.e., the likelihood cut for electrons and
the tight isolation requirement for muons are dropped.
The number of events with leptons originating from a W
boson decay is denoted by N^sig. The number of events
originating from QCD multijet production is denoted by
N^QCD. N_ℓ and N_t can be written as:

\[
\begin{aligned}
N_\ell &= N^{\rm sig} + N^{\rm QCD} \\
N_t &= \varepsilon_{\rm sig} N^{\rm sig} + \varepsilon_{\rm QCD} N^{\rm QCD} .
\end{aligned}
\tag{2}
\]

ε_sig is the efficiency for a loose lepton from a W boson
decay to pass the tight criteria; it is measured in W+jets
MC events, and corrected by a data-to-MC scale factor
Channel   Process         1 jet        2 jets        3 jets        ≥ 4 jets
e+jets    tt̄ → ℓ+jets     0.79±0.03    6.02±0.08     12.99±0.11    9.01±0.09
e+jets    tt̄ → ℓℓ         4.39±0.07    11.84±0.11    3.91±0.07     0.55±0.03
μ+jets    tt̄ → ℓ+jets     0.52±0.03    4.67±0.07     11.66±0.11    9.20±0.10
μ+jets    tt̄ → ℓℓ         3.15±0.06    10.20±0.10    3.70±0.07     0.50±0.03

TABLE II: Summary of preselection efficiencies (%) for tt̄ events. Statistical uncertainties only are quoted.



Process          σ (pb)   NLO correction   Branching ratio
                                           e         μ
tb → ℓνbb        0.88                      0.1259    0.1253
tbq → ℓνbbj      1.98                      0.1259    0.1253
WW → ℓνjj        2.04     1.31             0.3928    0.3912
WZ → ℓνjj        0.61     1.35             0.3928    0.3912
WZ → jjℓℓ        0.18     1.35             0.4417    0.4390
                 0.16     1.28             0.4417    0.4390
Z/γ* → ττ        253                       0.3250    0.3171

TABLE III: Cross sections for background processes and the
corresponding NLO correction factors, where applicable.



derived from Z → ℓℓ events. ε_QCD is the rate at which a
loose lepton in QCD multijet events is selected as being
tight; it is measured in a low missing-E_T data sample which is
dominated by QCD multijet events.

The linear system in Eq. (2) can be solved for N^QCD
and N^sig; the number of W-like events in the preselected
samples is obtained as N_t^sig = ε_sig N^sig, and the number
of QCD multijet events as N_t^QCD = ε_QCD N^QCD. The
result is summarized in Table IV. The systematic
uncertainties on the numbers of events are obtained by varying
ε_sig and ε_QCD separately by one standard deviation and
adding the results of the two variations in quadrature. As
can be observed, W-like events dominate the preselected
samples.
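
The solution of the two-equation matrix method system can be sketched in Python as follows. The function name and the numerical inputs in the usage note are illustrative assumptions, not values or code from the analysis.

```python
def matrix_method(n_loose, n_tight, eff_sig, eff_qcd):
    """Solve  N_loose = N_sig + N_qcd  and  N_tight = e_sig*N_sig + e_qcd*N_qcd.

    Returns the W-like and QCD multijet yields in the tight sample,
    (e_sig * N_sig, e_qcd * N_qcd).
    """
    if eff_sig == eff_qcd:
        raise ValueError("the system is singular when eff_sig equals eff_qcd")
    denom = eff_sig - eff_qcd
    n_sig = (n_tight - eff_qcd * n_loose) / denom
    n_qcd = (eff_sig * n_loose - n_tight) / denom
    return eff_sig * n_sig, eff_qcd * n_qcd
```

For example, with hypothetical inputs `n_loose=1500`, `n_tight=900`, `eff_sig=0.8`, and `eff_qcd=0.2`, the tight sample splits into 800 W-like and 100 QCD multijet events, which add up to the 900 observed tight events.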





Channel   Quantity   1 jet      2 jets     3 jets    ≥ 4 jets
e+jets    N_t        6153       2217       466       119
e+jets    N_t^sig    5806±83    1976±50    395±23    99.8±11.6
e+jets    N_t^QCD    347±18     241±11     71±5      19.2±2.3
μ+jets    N_t        6827       2267       439       100
μ+jets    N_t^sig    6607±85    2155±50    406±22    91.4±10.7
μ+jets    N_t^QCD    220±12     112±10     33±5      8.6±2.0

TABLE IV: Numbers of preselected events and expected
contributions from W-like and QCD multijet events as a function
of jet multiplicity. Statistical uncertainties only are quoted.



VII. SECONDARY VERTEX b TAGGING 

Most of the non-tt̄ processes found in the preselected
sample do not contain heavy flavor quarks in the final
state. Requiring that one or more of the jets in the event
be tagged removes approximately 95% of the background
while keeping 60% of the tt̄ events. The performance
of the tagging algorithm and the methods used to
determine the corresponding efficiencies are described in
this section. The efficiencies are in general
parameterized as functions of jet p_T and η. For jets that contain
a muon, the jet p_T is corrected by subtracting the p_T
of the muon and the neutrino. For this correction the
neutrino is assumed to carry the same p_T as the muon.
This procedure preserves the relationship between the p_T
and the number of tracks in a jet, which would otherwise
be biased toward lower track multiplicities for jets that
contain muons.



A. Jet Tagging Efficiencies 

The probability for identifying a b jet using lifetime 
tagging is conveniently broken down into two compo- 
nents: the probability for a jet to be taggable, called 
taggability, and the probability for a taggable jet to be 
tagged by the SVT algorithm, called tagging efficiency. 
This breakdown of the probability decouples the tagging 
efficiency from issues related to detector inefficiencies, 
which are absorbed into the taggability. 



1. Jet Taggability 

A calorimeter jet is considered taggable if it is matched
within ΔR < 0.5 to a track-jet. The tracks in the
track-jet are required to have at least one hit in the SMT barrel
or F-disk, effectively reducing the SMT fiducial volume
to ≈ 36 cm from the center of the detector. Since this
volume is smaller than the DØ luminous region (≈ 54 cm),
the taggability is expected to have a strong dependence
on the PV_z of the event. Moreover, the relative sign
between the PV_z and the jet η must also be considered, as
particular combinations of the position of the PV along
the beam axis and the η of the jet would enhance or
reduce the probability that a track-jet passes through
the required region of the SMT.

Taggability is measured from a combined ℓ+jets sample
passing the preselection criteria with the tight lepton
requirement removed. In addition, the p_T requirement on
all the jets is reduced to 15 GeV to increase the statistics
of the sample. No statistically significant difference
between the taggability measured in this larger sample and
directly in the e+jets and μ+jets preselected samples is
observed. Figure 2 shows the measured taggability as a
function of PV_z and sign(PV_z × η) × |PV_z|. The
taggability decreases at the edges of the SMT barrel and this
effect is much more pronounced when sign(PV_z × η) > 0.
For this analysis, the taggability is parameterized as a
function of jet p_T and |η| in six bins of sign(PV_z × η) ×
|PV_z|: [−60,−46), [−46,−38), [−38,0), [0,20), [20,36),
[36,60] (cm). These six regions are labeled I-VI in
Fig. 2(b) and indicated by the vertical lines. They were
chosen by taking into consideration the edge of the SMT
fiducial region, the amount of data available for the fits,
and the flatness of the taggability in each region.
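
The binning in sign(PV_z × η) × |PV_z| can be sketched as a small lookup. The function names are hypothetical, and assigning jets with PV_z × η = 0 to the positive side is an assumption made here; the bin edges are the ones quoted in the text.

```python
# Bin edges (cm) of sign(PV_z * eta) * |PV_z| and the corresponding region labels.
EDGES = [-60.0, -46.0, -38.0, 0.0, 20.0, 36.0, 60.0]
LABELS = ["I", "II", "III", "IV", "V", "VI"]


def signed_pvz(pv_z, eta):
    """Return sign(PV_z * eta) * |PV_z|; a zero product is treated as positive."""
    sign = 1.0 if pv_z * eta >= 0 else -1.0
    return sign * abs(pv_z)


def taggability_region(pv_z, eta):
    """Return the region label I-VI, or None outside [-60, 60] cm."""
    x = signed_pvz(pv_z, eta)
    if x < EDGES[0] or x > EDGES[-1]:
        return None
    for label, upper in zip(LABELS, EDGES[1:]):
        if x < upper:  # bins are closed on the left, open on the right
            return label
    return LABELS[-1]  # x == 60 cm belongs to the closed last bin
```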



FIG. 2: Taggability vs. (a) PV_z and (b) sign(PV_z × η) × |PV_z|
as measured in data. The dashed lines correspond to the
boundaries between regions defined in the text.

A two-dimensional parameterization of the taggability
vs. jet p_T and |η| is derived by assuming that the
dependence is factorizable, so that ε(p_T, η) = C ε(p_T) ε(η).
The normalization factor C is such that the total number
of observed taggable jets equals the number of predicted
taggable jets, calculated as the sum over all reconstructed
jets weighted by their corresponding ε(p_T, η). Figure 3
shows ε(p_T) and ε(η) for the six regions defined above.
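
The role of the normalization factor C can be made concrete with a short sketch. The jet representation `(pt, eta, is_taggable)` and the callables `eff_pt` and `eff_eta` are hypothetical stand-ins for the fitted one-dimensional parameterizations.

```python
def normalization(jets, eff_pt, eff_eta):
    """C such that the predicted number of taggable jets equals the observed one.

    jets: list of (pt, eta, is_taggable) tuples.
    """
    n_observed = sum(1 for _, _, taggable in jets if taggable)
    unnormalized = sum(eff_pt(pt) * eff_eta(abs(eta)) for pt, eta, _ in jets)
    return n_observed / unnormalized


def make_taggability(jets, eff_pt, eff_eta):
    """Return eps(pt, eta) = C * eps(pt) * eps(|eta|)."""
    c = normalization(jets, eff_pt, eff_eta)
    return lambda pt, eta: c * eff_pt(pt) * eff_eta(abs(eta))
```

By construction, summing the returned function over the same jets reproduces the observed number of taggable jets, which is the closure condition stated in the text.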

The assumption that the taggability can be factorized
in terms of jet p_T and η is verified through a
validation test [28] that compares the numbers of predicted
and observed taggable jets as functions of jet p_T, η, PV_z,
and number of jets. For this study, the combined ℓ+jets
taggability parameterization is applied separately to the
e+jets and μ+jets preselected samples as a weight for
each jet. Statistical uncertainties of the fits used to
derive the parameterizations are assigned as errors to the
taggability. Good agreement between predicted and
observed distributions is observed for all variables.



FIG. 3: Taggability vs. jet p_T and |η| for PV_z × η < 0 [(a) and
(b), respectively] and PV_z × η > 0 [(c) and (d), respectively].
The central value is shown with a solid line, and the ±1σ
statistical uncertainty is shown as dotted lines. The labels I-VI
correspond to the regions of sign(PV_z × η) × |PV_z| defined in
Fig. 2.



2. Jet Flavor Dependence of Taggability

The taggability measured in data is dominated by the
predominant light quark jet contribution to the low jet
multiplicity bins. The ratios of b to light and c to light
taggabilities as functions of jet p_T and η are measured
in a QCD multijet MC sample and shown in Fig. 4. The
largest difference in taggability, approximately 5%, is
observed between b and light quark jets in the low p_T region,
corresponding to jets with low track multiplicity. The fits
to the ratios are used as flavor dependent correction
factors to the taggability.

The systematic uncertainty on the flavor dependence of
the taggability is estimated by substituting the
parameterization for b and c quark jets with the one determined
from Wbb̄ and Wcc̄ MC, respectively. The default
b-flavor (c-flavor) parameterization is retained for the
central value and the observed difference between that one
and the Wbb̄ (Wcc̄) parameterization is taken as the
systematic uncertainty.

In comparison with light quark jets, hadronic tau
lepton decays have a lower average track multiplicity and
are therefore expected to have lower taggability.
Figure 5 shows the ratio of τ to light quark jet taggability
as functions of jet p_T and η as measured in Z/γ* → ττ
and Z/γ* → qq̄ MC samples. The fit to the ratio is
used as a flavor dependent correction factor to the
taggability of hadronic tau decays in the estimation of the
Z/γ* → ττ background.



B. Tagging Efficiency 

The b and c quark jet tagging efficiencies are measured
in a tt̄ MC sample and calibrated to data using a
data-to-MC scale factor derived from a sample dominated by
semileptonic bb̄ decays. The efficiency of tagging a light
quark jet is measured in a data sample dominated by
light quark jets and corrected for contamination of heavy
flavor jets and long-lived particles (K_S^0, Λ). The
procedures followed to determine each of the tagging
efficiencies and their corresponding uncertainties are
summarized below.

FIG. 4: Ratio of the b to light (full circles) and c to light
(open squares) quark jet taggability, measured in a QCD MC
sample as functions of (a) jet p_T and (b) jet |η|. The resulting
fits used in the analysis are also shown.

FIG. 5: Ratio of the hadronic τ to light quark jet taggability,
measured in Z/γ* → ττ and Z/γ* → qq̄ MC samples as
functions of (a) jet p_T and (b) jet |η|. The resulting fits used
in the analysis are also shown.



1. Semileptonic b Tagging Efficiency 

The tagging efficiency for b quarks that decay
semileptonically to muons is referred to as the semileptonic b
tagging efficiency. It is measured in data using a system
of eight equations (System8 method) constructed from
the total number of events in two samples with different
b jet content, before and after tagging with two b tagging
algorithms. The two data samples used are the
muon-in-jet sample (n) and the muon-in-jet-away-jet-tagged sample (p)
(see Sec. IV G for the definition of these samples). The
two b tagging algorithms are SVT and the soft lepton
tagger (SLT). The SLT algorithm requires the presence of a
muon with ΔR(μ, jet) < 0.5 and p_T^rel > 0.7 GeV within
the jet, where p_T^rel refers to the muon momentum
transverse to the momentum of the jet-muon system. The jets
are divided into two categories, b jets and c+light (cl) jets,
and the following system of eight equations is written:



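\[
\begin{aligned}
n &= n_b + n_{cl} \\
p &= p_b + p_{cl} \\
n^{\rm SVT} &= \varepsilon_b^{\rm SVT} n_b + \varepsilon_{cl}^{\rm SVT} n_{cl} \\
p^{\rm SVT} &= \beta\,\varepsilon_b^{\rm SVT} p_b + \alpha\,\varepsilon_{cl}^{\rm SVT} p_{cl} \\
n^{\rm SLT} &= \varepsilon_b^{\rm SLT} n_b + \varepsilon_{cl}^{\rm SLT} n_{cl} \\
p^{\rm SLT} &= \varepsilon_b^{\rm SLT} p_b + \varepsilon_{cl}^{\rm SLT} p_{cl} \\
n^{\rm SVT,SLT} &= \kappa_b\,\varepsilon_b^{\rm SVT}\varepsilon_b^{\rm SLT} n_b
                 + \kappa_{cl}\,\varepsilon_{cl}^{\rm SVT}\varepsilon_{cl}^{\rm SLT} n_{cl} \\
p^{\rm SVT,SLT} &= \kappa_b\,\beta\,\varepsilon_b^{\rm SVT}\varepsilon_b^{\rm SLT} p_b
                 + \kappa_{cl}\,\alpha\,\varepsilon_{cl}^{\rm SVT}\varepsilon_{cl}^{\rm SLT} p_{cl}
\end{aligned}
\]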



The terms on the left hand side represent the total
number of jets in each sample before tagging (n, p) and after
tagging with the SVT algorithm (n^SVT, p^SVT), the SLT
algorithm (n^SLT, p^SLT), and both (n^SVT,SLT, p^SVT,SLT).
The eight unknowns on the right hand side of the
equations consist of the number of b and c+light jets in the
two samples (n_b, n_cl, p_b, p_cl), and the tagging
efficiencies for b and c+light jets for the two tagging algorithms
(ε_b^SVT, ε_b^SLT, ε_cl^SVT, ε_cl^SLT). The method assumes that the
efficiency for tagging a jet with both the SVT and the
SLT algorithm can be calculated as the product of the
individual tagging efficiencies. Four additional
parameters are needed to solve the system of equations: κ_b, κ_cl,
α, and β. The first two parameters represent the
correlation between the SVT and the SLT tagger for b jets (κ_b)
and c+light jets (κ_cl), respectively. They are defined as

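\[
\kappa_b = \frac{\varepsilon_b^{\rm SVT,SLT}}{\varepsilon_b^{\rm SVT}\,\varepsilon_b^{\rm SLT}}
\]

and

\[
\kappa_{cl} = \frac{\varepsilon_{cl}^{\rm SVT,SLT}}{\varepsilon_{cl}^{\rm SVT}\,\varepsilon_{cl}^{\rm SLT}} .
\]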

β and α represent the ratio of the SVT tagging
efficiencies for b and c+light jets, respectively, corresponding to
the two data samples used to solve System8. They are
defined as

\[
\beta = \frac{\varepsilon_b^{\rm SVT}\ \text{from muon-in-jet-away-jet-tagged sample}}
             {\varepsilon_b^{\rm SVT}\ \text{from muon-in-jet sample}}
\]

and

\[
\alpha = \frac{\varepsilon_{cl}^{\rm SVT}\ \text{from muon-in-jet-away-jet-tagged sample}}
              {\varepsilon_{cl}^{\rm SVT}\ \text{from muon-in-jet sample}} .
\]

κ_b, κ_cl, and β are measured in a MC sample mixture of
Z/γ* → bb̄ → μ, Z/γ* → cc̄, Z/γ* → qq̄, QCD multijet,
and tt̄, giving κ_b = 0.978 ± 0.002, κ_cl = 0.826 ± 0.014, and
β = 0.999 ± 0.006. α is arbitrarily chosen to be 1.0 ± 0.8.

The system of equations is solved for each p_T and η
bin separately. The resulting semileptonic b tagging
efficiency for the SVT algorithm is shown in Fig. 6.
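
The structure of the System8 equations can be illustrated by writing them as a forward model that maps the eight unknowns to the eight observed counts, together with a residual function that a nonlinear solver could drive to zero. This is a sketch only: the function and key names are hypothetical, the solver itself is not shown, and the numbers in the usage note are invented for illustration.

```python
KEYS = ["n", "p", "n_svt", "p_svt", "n_slt", "p_slt", "n_both", "p_both"]


def system8_observables(n_b, n_cl, p_b, p_cl,
                        e_b_svt, e_cl_svt, e_b_slt, e_cl_slt,
                        kappa_b=1.0, kappa_cl=1.0, alpha=1.0, beta=1.0):
    """Predict the eight observed jet counts from the eight unknowns."""
    return {
        "n": n_b + n_cl,
        "p": p_b + p_cl,
        "n_svt": e_b_svt * n_b + e_cl_svt * n_cl,
        "p_svt": beta * e_b_svt * p_b + alpha * e_cl_svt * p_cl,
        "n_slt": e_b_slt * n_b + e_cl_slt * n_cl,
        "p_slt": e_b_slt * p_b + e_cl_slt * p_cl,
        "n_both": (kappa_b * e_b_svt * e_b_slt * n_b
                   + kappa_cl * e_cl_svt * e_cl_slt * n_cl),
        "p_both": (kappa_b * beta * e_b_svt * e_b_slt * p_b
                   + kappa_cl * alpha * e_cl_svt * e_cl_slt * p_cl),
    }


def system8_residuals(unknowns, observed, **params):
    """Differences (predicted - observed); all vanish at a solution."""
    predicted = system8_observables(*unknowns, **params)
    return [predicted[key] - observed[key] for key in KEYS]
```

As a usage example, one can generate counts from assumed true values such as `(600, 400, 300, 50, 0.4, 0.02, 0.7, 0.3)` with the correlation parameters quoted in the text and check that the residuals vanish at those values and not elsewhere.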
FIG. 6: Semileptonic b tagging efficiency vs. (a) jet p_T and
(b) jet |η| measured in data with the System8 method. The
resulting fit is shown as a solid line, and the ±1σ statistical
uncertainty is shown as dotted lines.



FIG. 7: Semileptonic b tagging efficiency vs. (a) jet p_T and
(b) jet |η| measured in a bb̄ MC sample. The resulting fit is
shown as a solid line, and the ±1σ statistical uncertainty is
shown as dotted lines.



The statistical uncertainty is given by the error on the
fit to the parameterization as functions of jet p_T and
|η|. The systematic uncertainties are obtained from the
change in the semileptonic b tagging efficiency resulting
from the variation of the correlation parameters α, β,
κ_b, and κ_cl. β and κ_cl are varied within the uncertainties
obtained when the distributions of β and κ_cl as functions
of jet p_T are fitted to constants. The variation of κ_b is
determined from the difference between the value of κ_b
obtained in the MC sample described above and those
obtained from Z/γ* → bb̄ and tt̄ MC samples. Another
source of systematic uncertainty comes from the choice
of the p_T^rel cut used in the SLT tagger.



2. Measurement of the Inclusive Tagging Efficiencies 

The inclusive b and c tagging efficiencies are measured
in a MC tt̄ sample and calibrated by a data-to-MC scale
factor given by the ratio of the semileptonic b tagging
efficiency as measured in data to the one measured in a
bb̄ MC sample. The bb̄ MC is chosen to determine the
scale factor because it is expected to best simulate the
data samples used in the System8 fit. With this
procedure, the topological dependence of the tagging
efficiencies is taken from the tt̄ sample, and the overall efficiency
normalization is calibrated to data. Figure 7 shows the
semileptonic b tagging efficiency as measured in the bb̄
MC sample. Figure 8 shows the inclusive b and c tagging
efficiencies that are used in the analysis.
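
The calibration described above amounts to a single multiplicative scale factor, as in the following sketch. The function name and the numerical values in the usage note are assumptions chosen for illustration.

```python
def calibrated_efficiency(eff_inclusive_ttbar_mc, eff_semilep_data, eff_semilep_bb_mc):
    """Scale the inclusive MC efficiency by the data-to-MC semileptonic ratio."""
    scale_factor = eff_semilep_data / eff_semilep_bb_mc
    return eff_inclusive_ttbar_mc * scale_factor
```

For instance, a hypothetical inclusive MC efficiency of 0.50 with semileptonic efficiencies of 0.36 in data and 0.40 in MC would be calibrated to 0.45 in a given (p_T, η) bin.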

The systematic uncertainty on the semileptonic b
tagging efficiency from MC is taken as the difference between
the 2D parameterization obtained from bb̄ MC and the
one derived from a tt̄ MC sample. For the inclusive b
and c tagging efficiencies, the systematic uncertainty is
taken as the difference between the 2D parameterizations
obtained from tt̄ MC samples with two choices of b
fragmentation models [29]. In both cases, the systematic
uncertainties in each p_T and η bin are added in
quadrature to the corresponding statistical uncertainty arising
from the fit giving the default parameterization.

FIG. 8: Inclusive b tagging efficiency vs. (a) jet p_T and (b)
jet |η| and inclusive c tagging efficiency vs. (c) jet p_T and (d)
jet |η|. The resulting fit is shown as a solid line, and the ±1σ
statistical uncertainty is shown as dotted lines.

A closure test [28] of the parameterized MC tagging
efficiency is performed in each case on the MC sample
used to derive the default parameterization. In addition,
a validation is performed on a matched W+jets sample
(Appendix A) that has passed the preselection cuts. In
both cases, the predicted tags are compared with the
observation as functions of jet p_T, η, and jet multiplicity.
Good agreement between prediction and observation is
observed in all cases.

The hadronic τ tagging efficiency is measured in a
Z/γ* → ττ MC sample and assigned a 50% systematic
uncertainty. In this analysis, the hadronic τ tagging
efficiency is used only in the estimation of the Z/γ* → ττ
background.



C. Measurement of the Mistag Rate 

Mistags are defined as light flavor jets that have been
tagged by the SVT algorithm from random overlap of
tracks that are displaced from the PV due to tracking
errors or resolution effects. Since the SVT algorithm
is symmetric in its treatment of both the impact
parameter and the decay length significance L_xy/σ(L_xy),
the mistags are expected to occur at the same rate for
positive tags (L_xy/σ(L_xy) > 7.0) and for negative tags
(L_xy/σ(L_xy) < −7.0). The negative tagging rate
measured in a sample dominated by light jets can therefore
be used to extract the mistag rate after correcting for the
contamination of heavy flavor (hf) jets in the negative
tags, and the presence of long lived particles (ll) in the
positive tags.

For this analysis, the negative tagging efficiency is
measured in the EMqcd data sample, which is dominated by
QCD multijet production, and parameterized as
functions of jet p_T and η, as shown in Fig. 9. A closure test
of the parameterization is performed by comparing the
predicted rates of negative tags to the observed one in
the same sample used to derive the parameterizations.
Good agreement is observed in all distributions for jet
p_T, |η|, and jet multiplicity.
FIG. 9: Negative tagging efficiency vs. (a) jet p_T and (b) jet
|η|. The resulting fit is shown as a solid line, and the ±1σ
statistical uncertainty is shown as dotted lines.

The parameterized negative tag rate is also applied to 
all taggable jets in the preselected samples, and the pre- 
diction is compared to the actual number of observed 
negative tags. The results are summarized in Table V
and show good agreement between prediction and
observation.

To be able to use this measurement to estimate mistags
from light quark jets, a correction is needed since the data
sample is expected to contain a small contribution from
b and c jets (≈ 2% and ≈ 4%, respectively, as predicted
by pythia) that have a higher negative tagging efficiency
than light quark jets. A correction factor SF_hf is derived





Channel   Quantity   1 jet       2 jets      3 jets       ≥ 4 jets
e+jets    N^pred     24.6±5.0    13.4±3.7    3.89±1.97    1.54±1.24
e+jets    N^obs      22          16          5            4
μ+jets    N^pred     34.3±5.9    17.5±4.2    4.55±2.13    1.44±1.20
μ+jets    N^obs      32          13          6            1
ℓ+jets    N^pred     58.9±7.7    30.9±5.6    8.44±2.90    2.98±1.73
ℓ+jets    N^obs      54          29          11           5

TABLE V: Numbers of observed and predicted negative tags
in the preselected signal samples.



from pythia QCD multijet MC as the ratio between the
negative tagging rate for light quark jets and the one
obtained for an inclusive jet sample:

\[
SF_{hf}(p_T, \eta) =
  \frac{\varepsilon_-^{\rm light}(p_T, \eta)}{\varepsilon_-^{\rm inclusive}(p_T, \eta)} .
\]

In addition, the long-lived particles present in the EMqcd
sample lead to a larger positive than negative tagging
efficiency. A correction factor SF_ll is derived from pythia
QCD multijet MC as the ratio between the positive and
the negative tagging rates for light jets:

\[
SF_{ll}(p_T, \eta) =
  \frac{\varepsilon_+^{\rm light}(p_T, \eta)}{\varepsilon_-^{\rm light}(p_T, \eta)} .
\]

Both scale factors are shown in Fig. 10. Finally, the
mistag rate is given by

\[
\varepsilon_+^{\rm light}(p_T, \eta) =
  \varepsilon_-^{\rm data}(p_T, \eta)\, SF_{hf}(p_T, \eta)\, SF_{ll}(p_T, \eta) .
\]

The systematic uncertainty on the mistag rate is
determined by coherently varying by 20% the b and c
fractions in the pythia QCD multijet MC sample used to
measure SF_hf and SF_ll. The resulting systematic
uncertainty in each p_T and η bin is added in quadrature to the
corresponding statistical uncertainty arising from the fit
giving the default parameterization for ε_-^data, SF_hf, and
SF_ll.
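
The chain of corrections leading to the mistag rate can be sketched as three small functions. The names are hypothetical and the inputs stand for the values of the parameterizations in a single (p_T, η) bin.

```python
def sf_hf(eff_neg_light_mc, eff_neg_inclusive_mc):
    """Heavy flavor correction: negative tag rate of light jets over inclusive jets."""
    return eff_neg_light_mc / eff_neg_inclusive_mc


def sf_ll(eff_pos_light_mc, eff_neg_light_mc):
    """Long-lived particle correction: positive over negative tag rate for light jets."""
    return eff_pos_light_mc / eff_neg_light_mc


def mistag_rate(eff_neg_data, scale_hf, scale_ll):
    """Mistag rate = negative tag rate in data times both scale factors."""
    return eff_neg_data * scale_hf * scale_ll
```

If the inclusive negative tag rate in data equals the one in MC, the product collapses to the positive tag rate of light jets in MC, which is a useful sanity check of the bookkeeping.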



D. Event Tagging Probability 

The probability for a jet of a given flavor α (b, c, or
light quark jet) to be tagged is obtained as the product
of the taggability and the calibrated tagging efficiency:

\[
\mathcal{P}_\alpha(p_T, \eta) = P_\alpha^{\rm taggab}(p_T, \eta)\,\varepsilon_\alpha(p_T, \eta) .
\]

The probability for a given MC event to contain at
least one SVT-tagged jet is given by the complement of
the probability that none of the jets is tagged:

\[
P_{\rm event}^{\rm tag}(\geq 1\ {\rm tag}) = 1 - P_{\rm event}^{\rm tag}(0\ {\rm tag}) ,
\]

with

\[
P_{\rm event}^{\rm tag}(0\ {\rm tag}) =
  \prod_{j=1}^{N_{\rm jets}} \left[ 1 - \mathcal{P}_{\alpha_j}(p_{T_j}, \eta_j) \right] .
\]

The probabilities for a given MC event to have exactly
one or to have two or more SVT-tagged jets are given by

\[
P_{\rm event}^{\rm tag}(1\ {\rm tag}) =
  \sum_{j=1}^{N_{\rm jets}} \mathcal{P}_{\alpha_j}(p_{T_j}, \eta_j)
  \prod_{i \neq j} \left[ 1 - \mathcal{P}_{\alpha_i}(p_{T_i}, \eta_i) \right]
\]

and

\[
P_{\rm event}^{\rm tag}(\geq 2\ {\rm tag}) =
  P_{\rm event}^{\rm tag}(\geq 1\ {\rm tag}) - P_{\rm event}^{\rm tag}(1\ {\rm tag}) ,
\]

respectively. P_event^tag(1 tag) and P_event^tag(≥ 2 tag) are
referred to as single and double tagging probabilities,
respectively.

FIG. 10: Correction factors for the contribution of heavy
flavor in the negative tag rate (SF_hf) as functions of (a) jet p_T
and (b) jet |η|, and for the contribution to the mistag rate from long
lived particles (SF_ll) as functions of (c) jet p_T and (d) jet
|η|. The resulting fits are shown as solid lines, and the ±1σ
statistical uncertainties are shown as dotted lines.
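
The per-event probabilities follow directly from the per-jet tagging probabilities, as the sketch below shows. The function name and the dictionary keys are hypothetical; the input is a list with one per-jet probability for each jet in the event.

```python
import math


def event_tag_probabilities(jet_probs):
    """Return P(0 tag), P(1 tag), P(>=1 tag), and P(>=2 tag) for one event."""
    p0 = math.prod(1.0 - p for p in jet_probs)
    p1 = sum(
        p * math.prod(1.0 - q for i, q in enumerate(jet_probs) if i != j)
        for j, p in enumerate(jet_probs)
    )
    return {"0": p0, "1": p1, ">=1": 1.0 - p0, ">=2": 1.0 - p0 - p1}
```

Averaging these per-event values over a simulated sample gives the average event tagging probability of the corresponding process.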

The average event tagging probability for a certain
process, P_process^tag, is calculated by averaging the per-event SVT
tagging probability over a sample of events for the
process under consideration. The probability for an event to
satisfy the trigger conditions is included in the
calculation, as the trigger can distort the jet p_T and η spectra,
particularly for the low jet multiplicity bins.

The trigger-corrected average event tagging
probability is measured for MC tt̄ events that pass the
preselection and originated from the processes tt̄ → ℓ+jets and
tt̄ → ℓℓ; the results are summarized in Table VI.



VIII. COMPOSITION OF THE TAGGED 
SAMPLE 

The main background to the tagged ℓ+jets sample is
heavy flavor production in association with a W boson.
Additional contributions arise from direct QCD heavy
flavor production, other low rate electroweak processes
(single top, diboson, and Z/γ* → ττ production), as well
as mistags of light quark jets. The methods used to
estimate the contribution from these background processes
are described below.



A. Evaluation of the W+jets Background

Available MC generators are able to perform matrix
element calculations for W+jets events with high jet
multiplicities only at leading order. As a result, the
overall normalization of the calculations suffers from large
theoretical uncertainties, although the relative
contributions of the different processes are well described. In
this analysis, the overall normalization of the W+jets
contribution is obtained directly from collider data, and
only the relative contributions of different processes are
taken from MC. The contribution of W+jets events to
the tagged sample is then estimated by multiplying the
number of W+jets events of each type in the preselected
sample by the SVT efficiency corresponding to the type
of process under consideration, as described below.

The overall normalization of the W-like background in
the preselected sample before tagging (N_t^sig) is obtained
directly from collider data as described in Sec. VI. N_t^sig
consists mostly of W+jets background events, with
contributions from tt̄ and other low rate electroweak
processes. Thus, the number of W+jets events in the
preselected sample can be calculated as

\[
N_{W+{\rm jets}}^{\rm presel} = N_t^{\rm sig}
  - \sum_{{\rm bkg}\ i} N_{{\rm bkg}\ i}^{\rm presel}
  - N_{t\bar{t} \to \ell+{\rm jets}}^{\rm presel}
  - N_{t\bar{t} \to \ell\ell}^{\rm presel} ,
\]

where i loops over the electroweak backgrounds. It is
important to note that N^presel_{tt̄→ℓ+jets} and N^presel_{tt̄→ℓℓ} are allowed
to float during the extraction of the tt̄ cross section,
adjusting the W+jets contribution accordingly.

The predicted number of W+jets events in the tagged sample is 
obtained by multiplying the estimated number of preselected 
W+jets events by the corresponding average event tagging 
probability $P^{\rm tag}_{W+{\rm jets}}$: 



\[
N^{\rm tag}_{W+{\rm jets}} = N^{\rm presel}_{W+{\rm jets}}\, P^{\rm tag}_{W+{\rm jets}}.
\]



$P^{\rm tag}_{W+{\rm jets}}$ is obtained by adding the tagging 
probabilities for the different flavor configurations 
considered, weighted by their fractions within a given jet 
multiplicity bin: 

\[
P^{\rm tag}_{W+{\rm jets}} = \sum_{\Phi} F_{\Phi,n}\, P^{\rm tag}_{\Phi,n}.
\]
                         e+jets                                     μ+jets 
               1 jet     2 jets    3 jets    ≥4 jets      1 jet     2 jets    3 jets    ≥4 jets 

tt̄ single tag probabilities (%) 
tt̄ → l+jets   26.6±0.7  38.7±0.2  43.3±0.1  44.7±0.1     26.2±0.9  37.8±0.2  42.7±0.1  44.1±0.1 
tt̄ → ll       38.8±0.2  44.7±0.1  44.9±0.2  44.6±0.5     38.4±0.3  44.0±0.1  44.5±0.2  44.1±0.5 

tt̄ double tag probabilities (%) 
tt̄ → l+jets   -         4.93±0.10 11.5±0.1  15.4±0.1     -         5.06±0.11 11.5±0.1  15.2±0.1 
tt̄ → ll       -         12.4±0.1  13.6±0.1  14.1±0.4     -         12.1±0.1  13.6±0.1  13.5±0.4 

TABLE VI: Summary of the average event tagging probabilities (%) for tt̄ events that pass the preselection and originate from 
the processes tt̄ → l+jets and tt̄ → ll. Statistical uncertainties only are quoted. 



$F_{\Phi,n}$ gives the fraction of events that pass the 
preselection for each flavor configuration $\Phi$ per jet 
multiplicity bin $n$. It is determined by: 

\[
F_{\Phi,n} = \frac{\sigma^{\rm eff}_{\Phi,n}}{\sum_{\Phi} \sigma^{\rm eff}_{\Phi,n}},
\]

where $\sigma^{\rm eff}_{\Phi,n} = \sigma_{\Phi,n} \cdot 
\varepsilon^{\rm presel,match}_{\Phi,n}$ is the effective cross 
section, obtained by multiplying the theoretical cross section 
$\sigma_{\Phi,n}$ from ALPGEN by the preselection and matching 
efficiency $\varepsilon^{\rm presel,match}_{\Phi,n}$ for each 
flavor configuration and jet multiplicity. The flavor 
configurations considered in the analysis were identified 
according to the ad hoc matching prescription discussed in 
Appendix A and are summarized in Table VII. 
$P^{\rm tag}_{\Phi,n}$ is the corresponding average event 
tagging probability, as defined in Sec. VII D. The resulting 
event tagging probabilities for each W+jets flavor subprocess 
are summarized in Table VIII. 
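
The weighting of tagging probabilities by flavor fractions can be illustrated with a short Python sketch. It is only an illustration of the sum over flavor configurations, not the analysis code: the function and variable names are hypothetical, the inputs are the W+3 jets fractions of Table VII and the e+3 jets single tag probabilities of Table VIII (MC-based W+light row), and uncertainties are ignored.

```python
# Flavor fractions F (in %) for W+3 jets events, copied from Table VII.
fractions_pct = {
    "Wbb": 2.05, "Wcc": 2.94, "W(bb)": 2.03,
    "W(cc)": 3.08, "Wc": 4.93, "W+light": 85.0,
}

# Single tag probabilities P (in %) for e+3 jets, copied from Table VIII.
tag_prob_pct = {
    "Wbb": 45.6, "Wcc": 14.8, "W(bb)": 34.5,
    "W(cc)": 8.9, "Wc": 9.7, "W+light": 0.90,
}


def average_tag_probability(fractions, probabilities):
    """Return sum_Phi F_Phi * P_Phi, normalizing the fractions to unity."""
    total = sum(fractions.values())
    return sum(fractions[name] / total * probabilities[name] for name in fractions)


p_w_jets = average_tag_probability(fractions_pct, tag_prob_pct)
print(f"Average W+jets single tag probability: {p_w_jets:.2f}%")
```

With these inputs the weighted average comes out close to the 3.59% quoted in the W+jets row of Table VIII for e+3 jets.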

The choice of cone size used for the ad hoc matching procedure 
contributes to the systematic uncertainty. To estimate this 
effect, the cone size is varied from the default value of 
ΔR = 0.5 to ΔR = 0.7, and the difference, centered on the 
default value, is assigned as the systematic uncertainty on the 
fractions. This results in a relative uncertainty of 2% for the 
Wc fractions and 5% for the Wbb, W(bb), Wcc, and W(cc) 
fractions, in all jet multiplicities (refer to Appendix A for a 
definition of these samples). In addition, the W+jets fractions 
are also derived from limited-statistics MC samples where 
matrix element partons are matched to particle jets following 
the MLM matching scheme [30]. The difference between the 
fractions obtained from these samples and the ones derived from 
samples matched with the ad hoc method is less than 20% for the 
region of interest (events with three or more jets), and does 
not depend on the choice of matching parameters. An additional 
20% systematic uncertainty is assigned to the W+jets fractions 
based on this study. 

The fractions calculated with both matching procedures are 
obtained from MC samples based on LO calculations. Several 
studies [31, 32] of W+2 jets processes have established that 
the ratio of Wbb to Wjj cross sections at NLO is higher by a 
factor K = 1.05 ± 0.07 compared to the LO prediction. The 
systematic uncertainty on the K-factor arises from the residual 
dependence on the factorization scale and from the uncertainty 
on the PDFs, which is obtained using the 20 eigenvector pairs 
for the CTEQ6M PDFs. This K-factor is applied to correct the 
ad hoc fractions of Wbb, W(bb), Wcc, and W(cc), while for the 
Wc fraction, the LO prediction is used. The fraction of 
W+light jets is adjusted to ensure that the sum of all 
fractions equals 1. 
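
The K-factor correction and the renormalization of the W+light fraction can be sketched as follows. This is a minimal illustration under stated assumptions: the helper name is hypothetical, the input fractions are the uncorrected W+3 jets values of Table VII, and only the central value K = 1.05 is used (its ±0.07 uncertainty is not propagated).

```python
K_FACTOR = 1.05  # NLO/LO ratio of Wbb to Wjj cross sections quoted in the text
HEAVY_FLAVOR = ("Wbb", "W(bb)", "Wcc", "W(cc)")  # configurations scaled by K


def apply_k_factor(fractions_pct, k=K_FACTOR):
    """Scale heavy flavor fractions by k, keep Wc at LO, and set W+light so
    that all fractions add up to 100%."""
    corrected = {}
    for name, value in fractions_pct.items():
        if name == "W+light":
            continue  # recomputed below from the normalization condition
        corrected[name] = value * k if name in HEAVY_FLAVOR else value
    corrected["W+light"] = 100.0 - sum(corrected.values())
    return corrected


# Uncorrected W+3 jets fractions (in %) from Table VII.
lo_fractions = {
    "Wbb": 2.05, "Wcc": 2.94, "W(bb)": 2.03,
    "W(cc)": 3.08, "Wc": 4.93, "W+light": 85.0,
}
nlo_fractions = apply_k_factor(lo_fractions)
for name, value in nlo_fractions.items():
    print(f"{name:8s} {value:6.2f}%")
```

The W+light fraction absorbs the change, so the corrected set is normalized by construction.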

Additional systematic uncertainties associated with the W boson 
modeling arise from the choice of parton distribution 
functions, factorization scale, and heavy quark mass. The 
systematic uncertainty arising from each of these factors on 
the W+jets fractions is calculated from the relative change in 
the ALPGEN cross section, properly taking correlations into 
account. The PDF uncertainty is calculated using the 20 
eigenvector pairs from CTEQ6M; the factorization scale 
uncertainty is calculated by varying the scale to two times and 
one-half of the default value; the heavy quark mass uncertainty 
is calculated by varying by ±0.3 GeV [11] the heavy quark 
masses with respect to their default values ($m_b$ = 4.75 GeV 
and $m_c$ = 1.55 GeV). 

An alternative method of obtaining the event tagging 
probability for W+light jets is to apply the light tagging 
efficiency parameterization directly to the preselected signal 
sample. Under the assumption that the preselected sample is 
dominated by W+light jets events, this method has the advantage 
of taking the kinematic information directly from the data. The 
event tagging probabilities obtained with this alternative 
method are also shown in Table VIII and are in good agreement 
with those obtained from MC. 

The expected numbers of W+jets events for each flavor 
subprocess as a function of jet multiplicity are summarized in 
Tables IX and X for single and double tagged events, 
respectively. 



B. Evaluation of the QCD Multijet Background 

The QCD multijet background is evaluated by applying the matrix 
method directly to the tagged samples. Equation 2, originally 
defined for the preselected data in Sec. VI, can be rewritten 
for the single and double tagged samples and directly solved to 
obtain the number of QCD multijet events in the tagged samples. 
The 
Contribution   W+1 jet      W+2 jets     W+3 jets     W+≥4 jets 

Wbb            -            1.23±0.08    2.05±0.21    2.84±0.16 
Wcc            -            1.69±0.12    2.94±0.37    4.44±0.29 
W(bb)          0.86±0.03    1.46±0.09    2.03±0.15    2.99±0.24 
W(cc)          1.23±0.05    2.26±0.15    3.08±0.24    5.06±0.54 
Wc             4.41±0.18    6.25±0.43    4.93±0.48    4.30±0.23 
W+light        93.5±0.2     87.1±0.7     85.0±1.1     80.4±0.7 

TABLE VII: Fractions (%) of different W+jets flavor subprocesses contributing to each jet multiplicity bin when ad hoc 
matching and preselection are required. Statistical uncertainties only are quoted. 





                        e+jets                                            μ+jets 
           W+1 jet    W+2 jets     W+3 jets     W+≥4 jets      W+1 jet    W+2 jets     W+3 jets     W+≥4 jets 

Single tag probabilities (%) 
W+light    0.40±0.01  0.64±0.02    0.90±0.05    1.37±0.14      0.39±0.01  0.62±0.02    0.89±0.05    1.23±0.14 
W+light    0.39±0.01  0.62±0.04    0.90±0.02    1.32±0.06      0.41±0.01  0.74±0.04    0.92±0.03    1.23±0.05 
W(cc)      9.3±0.1    8.6±0.3      8.9±0.2      9.2±0.9        9.4±0.1    9.2±0.2      8.6±0.1      10.2±0.7 
W(bb)      38.4±0.4   35.4±0.6     34.5±0.4     34.9±1.9       38.5±0.4   36.3±0.6     33.7±0.4     35.8±1.5 
Wc         9.6±0.1    9.6±0.2      9.7±0.3      10.2±0.3       9.6±0.1    9.4±0.2      9.4±0.3      9.7±0.3 
Wcc        -          15.6±0.4     14.8±1.1     16.4±0.6       -          16.0±0.4     16.2±0.7     16.3±0.6 
Wbb        -          43.8±0.7     45.6±0.9     44.5±0.9       -          44.0±0.8     44.0±1.0     44.0±0.8 
W+jets     1.23±0.01  2.66±0.04    3.59±0.05    5.03±0.07      1.25±0.01  2.78±0.04    3.57±0.04    4.97±0.08 

Double tag probabilities (%) 
W+light    -          < 0.01       < 0.01       < 0.01         -          < 0.01       < 0.01       < 0.01 
W(cc)      -          0.03±0.01    0.09±0.01    0.14±0.05      -          0.04±0.01    0.08±0.01    0.14±0.04 
W(bb)      -          0.49±0.09    0.97±0.09    0.52±0.11      -          0.96±0.15    0.77±0.07    1.35±0.39 
Wc         -          0.023±0.002  0.052±0.004  0.082±0.004    -          0.030±0.002  0.051±0.004  0.074±0.004 
Wcc        -          0.76±0.04    0.75±0.10    0.97±0.08      -          0.80±0.04    0.94±0.10    1.05±0.09 
Wbb        -          12.2±0.5     13.1±0.8     14.1±0.6       -          13.0±0.4     12.5±0.7     12.8±0.5 
W+jets     -          0.17±0.01    0.32±0.02    0.48±0.02      -          0.19±0.01    0.31±0.01    0.47±0.02 

TABLE VIII: Tagging probabilities (%) for preselected W+jets events for single tags (top rows) and double tags (bottom 
rows). The uppermost row labeled W+light corresponds to the efficiencies obtained from applying the light tagging efficiency 
parameterization to the preselected signal sample. The rows labeled W+jets summarize the average event tagging probabilities 
for W boson events. These values are not used in the analysis and are included for informational purposes only. In all cases, 
statistical uncertainties only are quoted. 



rate at which a loose lepton in QCD multijet events ap- 
pears to be tight is remeasured for the tagged samples 
and found to agree with the one used for the preselected 
samples. 

As a cross check, the QCD multijet background in the single 
tagged e+jets sample is obtained by multiplying the number of 
QCD multijet events in the preselected sample 
($N_t^{\rm QCD}$) by the corresponding average event tagging 
probability $P^{\rm tag}_{\rm QCD}$, defined as the fraction of 
tagged events in the loose-minus-tight e+jets sample. The 
estimated number of tagged events is then given by 

\[
N^{\rm tag}_{\rm QCD} = N_t^{\rm QCD}\, P^{\rm tag}_{\rm QCD}.
\]

Good agreement is observed between the matrix method and the 
cross check. 

The cross check assumes that the heavy flavor composition in 
the loose-minus-tight data sample, where the average event 
tagging probability is derived, is identical to the heavy 
flavor composition of the QCD multijet background in the 
preselected sample. In the e+jets channel this assumption 
applies, since the instrumental background mainly originates 
from electromagnetically fluctuating jets misreconstructed as 
electrons. In the μ+jets channel, however, the instrumental 
background originates mainly from semileptonic decays of b 
quarks to muons; the heavy flavor fraction is therefore 
enriched when the isolation criterion is inverted, leading to a 
higher event tagging probability. As the cross check cannot be 
applied to the μ+jets channel, results from the matrix method 
are used to extract the cross section in both the e+jets and 
the μ+jets channels. 

Tables IX and X summarize the expected number of QCD multijet 
events as a function of jet multiplicity for single and double 
tag events, respectively. 

C. Physics Backgrounds 

Additional low rate electroweak processes that contribute to 
the tagged sample are diboson production (WW → l + jets, 
WZ → l + jets, WZ → jjll, ZZ → lljj), single top quark s- and 
t-channel production, and Z/γ* → ττ → l + jets, where one τ 
decays leptonically and the second one hadronically. The Z+jets 
background where one of the two leptons is not reconstructed is 
found to be negligible. 

For a given process i, the number of events before tagging is 
determined as 

\[
N_i^{\rm presel} = \sigma_i \cdot Br_i \cdot \mathcal{L} \cdot \varepsilon_i^{\rm presel},
\]

where $\sigma_i$, $Br_i$, and $\mathcal{L}$ stand, 
respectively, for the cross section, branching ratio, and 
integrated luminosity for the process under consideration. 
$\varepsilon_i^{\rm presel}$ includes the trigger efficiency 
for events that pass the preselection and is obtained by 
folding into the MC the per-lepton and per-jet trigger 
efficiencies measured in data. The preselection efficiency is 
entirely determined from MC with the appropriate scale factors 
applied. The estimated number of tagged events is given by 
$N_i^{\rm tag} = N_i^{\rm presel} P_i^{\rm tag}$, with 
$P_i^{\rm tag}$ the average event tagging probability for the 
corresponding process. 
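
As a small illustration of this bookkeeping, the following Python sketch evaluates the expected yield before and after tagging for one process. The function names are hypothetical, and the input numbers are placeholders chosen only to make the arithmetic easy to follow; they are not values used in the analysis.

```python
def expected_preselected(sigma_pb, branching_ratio, lumi_pb_inv, eff_presel):
    """Expected events before tagging: sigma * Br * L * eps_presel."""
    return sigma_pb * branching_ratio * lumi_pb_inv * eff_presel


def expected_tagged(n_presel, p_tag):
    """Expected tagged events: preselected yield times the average event
    tagging probability."""
    return n_presel * p_tag


# Placeholder inputs (hypothetical, for illustration only).
n_presel = expected_preselected(
    sigma_pb=1.0,          # process cross section in pb
    branching_ratio=0.5,   # branching ratio into the considered final state
    lumi_pb_inv=425.0,     # integrated luminosity in pb^-1
    eff_presel=0.10,       # preselection efficiency, trigger included
)
n_tag = expected_tagged(n_presel, p_tag=0.40)
print(f"before tagging: {n_presel:.2f}, after tagging: {n_tag:.2f}")
```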

Tables IX and X summarize the expected number of events for 
each of the processes considered as a function of jet 
multiplicity for single and double tag events, respectively. 



D. Observed and Predicted Numbers of Tagged 
Events 

The numbers of observed and predicted single and double tagged 
events are summarized in Tables IX and X, respectively. 
Figure 11 shows the observed number of tagged events in data 
compared to the total SM background predictions, excluding tt̄. 
The background in the first jet multiplicity bin is dominated 
by W+light and Wc events. The contribution from heavy flavor 
production, particularly from Wbb, dominates for events with 
three or more jets. Very good agreement between observation and 
background prediction is observed in the background-dominated 
first and second jet multiplicity bins, which gives confidence 
in the background estimate of the analysis. A clear excess of 
observed events over background is seen in the third and fourth 
jet multiplicity bins. The excess events are attributed to tt̄ 
production and are used to extract the cross section. Figure 12 
shows the observed number of tagged events in data compared to 
the total SM predictions including tt̄. The number of tt̄ events 
shown is calculated based on the measured cross section. 



IX. CROSS SECTION RESULT 

The tt̄ production cross section is extracted from the excess of 
tagged events over background expectation according to: 

\[
\sigma = \frac{N_{\rm obs} - N_{\rm bkg}}
              {Br \cdot \mathcal{L} \cdot \varepsilon_{\rm presel} \cdot P^{\rm tag}},
\]

where $Br$ is the branching ratio of the considered final 
state, $\mathcal{L}$ is the integrated luminosity, 
$\varepsilon_{\rm presel}$ is the tt̄ preselection efficiency, 
and $P^{\rm tag}$ is the probability for a tt̄ event to have one 
or more jets identified as b jets. 
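
A rough counting estimate in a single channel can be sketched in Python. The sketch assumes that the tt̄ yield quoted in Table IX for a cross section of 6.6 pb can stand in for the product Br · L · ε_presel · P^tag (summed over the l+jets and dilepton contributions), and it keeps the background fixed, whereas in the analysis the W+jets background is rederived as the cross section changes. The function name is hypothetical.

```python
REFERENCE_XSEC_PB = 6.6  # cross section assumed for the tt-bar rows of Table IX


def counting_cross_section(n_obs, n_bkg, n_tt_ref, ref_xsec_pb=REFERENCE_XSEC_PB):
    """Estimate sigma = (N_obs - N_bkg) / (expected tt-bar events per pb)."""
    signal_per_pb = n_tt_ref / ref_xsec_pb  # effective Br * L * eps_presel * P_tag
    return (n_obs - n_bkg) / signal_per_pb


# e+3 jets, single tag (Table IX): N_obs = 47, N_bkg = 18.8,
# tt-bar -> l+jets = 27.3 and tt-bar -> ll = 2.34 events at 6.6 pb.
sigma_e3 = counting_cross_section(n_obs=47, n_bkg=18.8, n_tt_ref=27.3 + 2.34)
print(f"e+3 jets single tag estimate: {sigma_e3:.2f} pb")
```

A single-channel estimate of this kind carries a large statistical uncertainty, which is why the analysis combines all channels in a likelihood fit.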

The tt̄ production cross section is calculated by performing a 
maximum likelihood fit to the observed number of events. The 
analysis is split into eight different channels: e+3 jets 
single tag, e+3 jets double tag, e+4 jets single tag, e+4 jets 
double tag, μ+3 jets single tag, μ+3 jets double tag, μ+4 jets 
single tag, and μ+4 jets double tag. The resulting cross 
sections are given for the electron and the muon channels 
separately and combined. If the index γ refers to one of the 
eight channels, the likelihood $\mathcal{L}_1$ to observe 
$N_\gamma^{\rm obs}$ for a cross section $\sigma_{t\bar{t}}$ is 
proportional to 

\[
\mathcal{L}_1 = \prod_{\gamma} P\left[N_\gamma^{\rm obs},\, N_\gamma^{\rm pred}(\sigma_{t\bar{t}})\right]. \qquad (3)
\]

$P(n, \mu) = \mu^n e^{-\mu}/n!$ generically denotes the Poisson 
probability function for n observed events, given an 
expectation of μ events. The predicted number of events in each 
channel is the sum of the predicted number of background events 
and the number of expected tt̄ events. Both the number of W+jets 
events before tagging and the number of expected tt̄ events are 
functions of the tt̄ cross section that is being determined. For 
each iteration of the maximization procedure of the likelihood, 
the number of tt̄ events in the untagged sample is calculated 
and the number of W+jets events is rederived. A detailed 
explanation of the treatment of the event statistics in the 
cross section calculation can be found in Appendix B. 
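
The structure of the Poisson likelihood of Eq. 3 can be illustrated with a simplified Python sketch. It is not the fit used in the analysis: the backgrounds are held fixed at the values of Tables IX and X (the analysis rederives the W+jets normalization at each iteration), the tt̄ yields are simply scaled from the 6.6 pb reference, no systematic uncertainties enter, and the maximum is found by a plain grid scan. All names are hypothetical.

```python
import math

REFERENCE_XSEC_PB = 6.6  # cross section assumed for the tt-bar yields in the tables

# (label, N_obs, N_bkg, N_ttbar at 6.6 pb), read off Tables IX and X.
# N_ttbar is the sum of the l+jets and dilepton rows.
CHANNELS = [
    ("e+3 jets, single tag",    47, 18.8, 27.3 + 2.34),
    ("e+>=4 jets, single tag",  33, 6.0,  19.8 + 0.33),
    ("mu+3 jets, single tag",   41, 15.7, 23.6 + 2.18),
    ("mu+>=4 jets, single tag", 26, 5.4,  18.8 + 0.29),
    ("e+3 jets, double tag",     2, 1.70, 7.3 + 0.65),
    ("e+>=4 jets, double tag",  11, 0.34, 6.9 + 0.09),
    ("mu+3 jets, double tag",    3, 1.44, 6.2 + 0.61),
    ("mu+>=4 jets, double tag",  8, 0.29, 6.3 + 0.08),
]


def log_poisson(n, mu):
    """Logarithm of the Poisson probability P(n, mu) = mu^n exp(-mu) / n!."""
    return n * math.log(mu) - mu - math.lgamma(n + 1)


def neg_log_likelihood(sigma_pb):
    """-ln L1 summed over the eight channels for a trial cross section."""
    total = 0.0
    for _label, n_obs, n_bkg, n_tt_ref in CHANNELS:
        n_pred = n_bkg + n_tt_ref * sigma_pb / REFERENCE_XSEC_PB
        total -= log_poisson(n_obs, n_pred)
    return total


def fit_cross_section(lo=3.0, hi=10.0, step=0.01):
    """Grid scan for the cross section that minimizes -ln L1."""
    n_steps = int(round((hi - lo) / step))
    grid = [lo + i * step for i in range(n_steps + 1)]
    return min(grid, key=neg_log_likelihood)


best_sigma = fit_cross_section()
print(f"simplified fit: sigma_ttbar ~ {best_sigma:.2f} pb")
```

Because of the simplifications listed above, the result of this scan should only be expected to land near the published value, not to reproduce it.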

The final cross section is determined using a nuisance 
parameter likelihood method [34] that incorporates all 
systematic uncertainties in the fit in such a way that allows 
them to affect the central value of the cross section. In this 
approach, each independent source of systematic uncertainty is 
modeled by a free parameter. Each nuisance parameter is modeled 
with a Gaussian centered on zero and with a standard deviation 
of one. The nuisance parameters are allowed to change the 
central values of all efficiencies, tagging probabilities, and 
flavor fractions, which are allowed to vary within their 
uncertainties. The correlations are taken into account in a 
natural way, by letting the same nuisance parameter affect 
different variables. The total likelihood function that is 
maximized is the product of $\mathcal{L}_1$ and 
$\mathcal{L}_2$, with 

\[
\mathcal{L}_2 = \prod_i G(\nu_i; 0, 1),
\]

where $G(\nu_i; 0, 1)$ is the normal probability of the 
nuisance parameter i to take the value $\nu_i$. 
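
A toy version of the combined likelihood, with one channel and one nuisance parameter, may help to see how the Gaussian constraint enters. The sketch below is an assumption-laden illustration: the single nuisance parameter shifts the background by a hypothetical 20% per unit of ν, the channel inputs are the e+≥4 jets single tag numbers of Table IX, and the minimum is found by a grid scan rather than a proper minimizer. All names are hypothetical.

```python
import math


def neg_log_l1(n_obs, n_pred):
    """-ln of the Poisson term for one channel."""
    return -(n_obs * math.log(n_pred) - n_pred - math.lgamma(n_obs + 1))


def neg_log_l2(nuisances):
    """-ln of the product of unit Gaussians G(nu; 0, 1)."""
    return sum(0.5 * nu * nu + 0.5 * math.log(2.0 * math.pi) for nu in nuisances)


def neg_log_total(sigma_pb, nu, n_obs=33, n_bkg=6.0, rel_bkg_unc=0.2,
                  n_tt_ref=20.13, ref_xsec_pb=6.6):
    """-ln(L1 * L2) for one channel; rel_bkg_unc is a hypothetical 20%
    background uncertainty steered by the nuisance parameter nu."""
    n_pred = n_bkg * (1.0 + rel_bkg_unc * nu) + n_tt_ref * sigma_pb / ref_xsec_pb
    return neg_log_l1(n_obs, n_pred) + neg_log_l2([nu])


# Grid scan over the cross section (pb) and the nuisance parameter.
sigma_grid = [i / 100 for i in range(300, 1201)]
nu_grid = [i / 10 for i in range(-30, 31)]
best_sigma, best_nu = min(
    ((s, v) for s in sigma_grid for v in nu_grid),
    key=lambda point: neg_log_total(*point),
)
print(f"sigma = {best_sigma:.2f} pb, nu = {best_nu:.1f}")
```

With a single channel the cross section can absorb any background shift, so the nuisance parameter stays at zero; it is only when several channels share the same nuisance parameter that the data can pull it away from zero and thereby move the central value.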
FIG. 11: Observed number of tagged events in data compared to the total SM background predictions (excluding tt̄) for (a) 
single tagged events and (b) double tagged events. The total uncertainty on the background prediction is represented by the 
hatched band. The excess of observed events in the third and fourth jet multiplicity bins is attributed to tt̄ production. 

FIG. 12: Observed number of tagged events in data compared to the total SM prediction for (a) single tagged events and (b) 
double tagged events. The number of tt̄ events shown is calculated assuming a cross section of 6.6 pb. The total uncertainty is 
represented by the hatched band. 
e+jets          1 jet          2 jets         3 jets         ≥4 jets 

W+light         20.9±0.7       10.1±0.7       2.45±0.19      0.59±0.13 
W(cc)           6.6±0.1        3.7±0.2        0.88±0.06      0.26±0.06 
W(bb)           18.8±0.3       9.6±0.3        2.25±0.16      0.58±0.12 
Wc              24.3±0.5       11.2±0.4       1.53±0.12      0.24±0.05 
Wcc             -              4.9±0.2        1.39±0.15      0.40±0.09 
Wbb             -              10.1±0.3       3.00±0.22      0.70±0.15 
W+jets          70.6±0.9       49.6±0.9       11.5±0.4       2.77±0.26 
QCD             6.8±1.5        10.0±1.7       5.2±1.2        2.95±0.[...] 
Single top      3.30±0.07      7.3±0.1        1.88±0.06      0.30±0.03 
Diboson         2.26±0.10      2.75±0.11      0.23±0.03      < 0.01 
Z → τ+τ−        0.15±0.04      0.40±0.07      0.03±0.01      < 0.01 
N_bkg           83.1±1.7       70.1±2.0       18.8±1.4       6.0±1.1 
Syst.           +10.7 −11.8    +8.5 −9.0      +1.9 −2.0      +0.5 −0.5 
tt̄ → l+jets     1.07±0.18      11.7±0.3       27.3±0.4       19.8±0.3 
tt̄ → ll         2.28±0.04      7.1±0.1        2.34±0.04      0.33±0.02 
N_pred          86.5±1.7       88.9±2.0       48.5±1.4       26.2±1.1 
Syst.           +10.7 −11.9    +8.3 −10.4     +2.0 −3.3      +1.0 −3.5 
N_obs           94             78             47             33 

μ+jets          1 jet          2 jets         3 jets         ≥4 jets 

W+light         24.9±0.8       13.2±0.8       2.63±0.19      0.46±0.11 
W(cc)           7.5±0.1        4.3±0.1        0.90±0.06      0.24±0.06 
W(bb)           21.6±0.4       10.9±0.3       2.32±0.15      0.50±0.12 
Wc              27.6±0.5       12.0±0.4       1.56±0.11      0.19±0.04 
Wcc             -              5.6±0.2        1.62±0.13      0.34±0.08 
Wbb             -              11.1±0.3       3.05±0.21      0.58±0.13 
W+jets          81.6±1.0       57.1±1.0       12.1±0.4       2.31±0.23 
QCD             7.2±1.3        5.8±1.3        1.57±0.89      2.77±1.02 
Single top      2.65±0.05      6.5±0.1        1.72±0.04      0.27±0.02 
Diboson         2.28±0.10      2.94±0.11      0.22±0.03      < 0.01 
Z → τ+τ−        0.19±0.07      0.29±0.05      0.09±0.05      0.01±0.02 
N_bkg           93.9±1.7       72.6±1.7       15.7±1.1       5.4±1.1 
Syst.           +12.2 −13.4    +9.3 −9.9      +2.0 −2.1      +0.4 −0.4 
tt̄ → l+jets     0.60±0.19      8.0±0.4        23.6±0.4       18.8±0.4 
tt̄ → ll         1.60±0.03      5.9±0.1        2.18±0.04      0.29±0.01 
N_pred          96.1±1.7       86.5±1.7       41.5±1.1       24.5±1.1 
Syst.           +12.3 −13.4    +9.8 −9.8      +2.2 −2.5      +2.6 −1.0 
N_obs           105            [...]          41             26 

TABLE IX: Summary of observed (N_obs) and predicted (N_pred) numbers of single tagged events in the e+jets and the μ+jets 
channels. Uncertainties shown are statistical; the systematic uncertainties are included in the rows labeled Syst. The number 
of tt̄ events quoted is calculated assuming a cross section of 6.6 pb. 





                        e+jets                                   μ+jets 
                2 jets        3 jets        ≥4 jets       2 jets        3 jets        ≥4 jets 

W+light         0.017±0.003   < 0.01        < 0.01        0.027±0.003   < 0.01        < 0.01 
W(cc)           0.014±0.002   < 0.01        < 0.01        0.019±0.003   < 0.01        < 0.01 
W(bb)           0.13±0.03     0.06±0.01     < 0.01        0.29±0.05     0.05±0.01     0.02±0.01 
Wc              0.027±0.002   < 0.01        < 0.01        0.039±0.003   < 0.01        < 0.01 
Wcc             0.24±0.01     0.07±0.01     0.02±0.01     0.28±0.01     0.09±0.01     0.02±0.01 
Wbb             2.80±0.13     0.86±0.08     0.22±0.05     3.30±0.14     0.87±0.07     0.17±0.04 
W+jets          3.23±0.13     1.00±0.08     0.26±0.05     3.96±0.15     1.02±0.08     0.22±0.04 
QCD             < 0.01        0.27±0.22     < 0.01        0.26±0.29     < 0.01        < 0.01 
Single top      1.07±0.02     0.39±0.02     0.07±0.01     0.93±0.01     0.37±0.01     0.07±0.01 
Diboson         0.34±0.02     0.04±0.01     < 0.01        0.26±0.02     0.03±0.01     < 0.01 
Z → τ+τ−        < 0.01        < 0.01        < 0.01        < 0.01        0.02±0.02     < 0.01 
N_bkg           4.64±0.28     1.70±0.40     0.34±0.29     5.42±0.33     1.44±0.34     0.29±0.38 
Syst.           +0.83 −0.81   +0.26 −0.25   +0.06 −0.06   +0.99 −0.97   +0.27 −0.25   +0.05 −0.06 
tt̄ → l+jets     1.72±0.19     7.3±0.3       6.9±0.2       1.02±0.15     6.2±0.3       6.3±0.3 
tt̄ → ll         1.81±0.02     0.65±0.01     0.09±0.01     1.50±0.02     0.61±0.01     0.08±0.01 
N_pred          8.2±0.3       9.7±0.4       7.3±0.3       7.9±0.4       8.3±0.3       6.7±0.4 
Syst.           +0.8 −1.9     +0.6 −1.3     +0.4 −1.8     +1.3 −1.0     +1.3 −0.7     +1.7 −0.4 
N_obs           12            2             11            6             3             8 

TABLE X: Summary of observed (N_obs) and predicted (N_pred) numbers of double tagged events in the e+jets and the μ+jets 
channels. Uncertainties shown are statistical; the systematic uncertainties are included in the rows labeled Syst. The number 
of tt̄ events quoted is calculated assuming a cross section of 6.6 pb. 



The measured tt̄ production cross sections for a top quark mass 
of 175 GeV are 

μ+jets : σ_tt̄ = 6.1 [...] (stat + syst) ± 0.4 (lum) pb, 
e+jets : σ_tt̄ = [...] (stat + syst) ± 0.4 (lum) pb, 
l+jets : σ_tt̄ = 6.6 ± 0.9 (stat + syst) ± 0.4 (lum) pb. 

The first uncertainty corresponds to the combined statistical 
and systematic uncertainties, and the second one to the 
luminosity error of ±6.1%. 



A complete list of systematic uncertainties is given in 
Table XI, where a cross indicates if the background 
normalization (Δb) and/or the tt̄ efficiency (Δε) are affected 
within a given channel. The systematic uncertainties have been 
classified as uncorrelated (usually of statistical origin in 
either MC or data) or correlated. The correlation can be 
between channels (i.e., e+jets and μ+jets) and/or between jet 
multiplicity bins ($N_{\rm jet} = 3$ and $N_{\rm jet} \geq 4$) 
within a particular channel. All systematic uncertainties are 
fully correlated between the single and double tagged samples. 





                                             e+jets       μ+jets 
Source                                       Δb    Δε     Δb    Δε 

Muon trigger                                              X     X 
EM trigger                                   X     X 
Muon preselection                                         X     X 
Electron preselection                        X     X 
Preselection efficiency (MC statistics)            X            X 
ε_QCD and ε_sig                              X            X 
Matrix method (data statistics)              X     X      X     X 
W fractions (MC statistics)                  X            X 
Jet trigger                                  X     X      X     X 
Jet preselection                             X     X      X     X 
Taggability in data                          X     X      X     X 
Flavor dependence of taggability             X     X      X     X 
Semileptonic b tagging efficiency in data    X     X      X     X 
Semileptonic b tagging efficiency in MC      X     X      X     X 
Inclusive b tagging efficiency in MC         X     X      X     X 
Inclusive c tagging efficiency in MC         X     X      X     X 
Negative tagging efficiency in data          X     X      X     X 
SF_ll and SF_hf                              X     X      X     X 
W fractions                                  X            X 

TABLE XI: Summary of systematic uncertainties affecting 
the signal efficiency and/or background prediction. The labels 
correlated and uncorrelated refer to the μ+jets and e+jets 
channels. 



The nuisance parameter likelihood provides the total 
uncertainty on the cross section including contributions 
from systematic and statistical origin. To estimate the 
contribution of each individual systematic source, all but 
the corresponding nuisance parameter are fixed in the fit, 
and the maximization is redone. The statistical contribu- 
tion is then deconvoluted from the obtained uncertainty 
to extract the contribution for that particular source. 
The resulting systematic uncertainties are summarized 
in Table XII. 

The total uncertainty, excluding luminosity, is 14%. The main 
contribution of ≈ 11% is statistical; the remaining ≈ 8% is due 
to systematic effects. The primary contribution to the 
systematic uncertainties arises from the semileptonic b tagging 
efficiency measured in data. The second largest source of 
systematic uncertainty originates from the matching of W 
fractions and higher-order effects. 

The measured cross section depends on the assumed mass of the 
top quark $m_t$. The dependence was studied by repeating the 
analysis on MC tt̄ samples generated at different values of 
$m_t$. The resulting dependence can be parameterized as 
$\sigma_{t\bar{t}}(m_t) = 0.000273\,m_t^2 - 0.145\,m_t + 23.5$ 
for the central value, $-0.00704\,m_t + 2.26$ for the $+1\sigma$ 
uncertainty, and $0.00687\,m_t - 2.17$ for the $-1\sigma$ 
uncertainty. The dependence is shown in Fig. 13. 
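
The parameterization can be evaluated directly, as in the following Python sketch. It assumes that $m_t$ is given in GeV and that the result is in pb, which is how the surrounding text quotes these quantities; the function names are hypothetical.

```python
def xsec_central(m_top_gev):
    """Central value of the measured cross section (pb) versus top mass (GeV)."""
    return 0.000273 * m_top_gev**2 - 0.145 * m_top_gev + 23.5


def xsec_plus_1sigma(m_top_gev):
    """Size of the +1 sigma uncertainty (pb) versus top mass (GeV)."""
    return -0.00704 * m_top_gev + 2.26


def xsec_minus_1sigma(m_top_gev):
    """Size of the -1 sigma uncertainty (pb, negative) versus top mass (GeV)."""
    return 0.00687 * m_top_gev - 2.17


for m_top in (170.0, 175.0, 180.0):
    print(
        f"m_t = {m_top:5.1f} GeV: sigma = {xsec_central(m_top):.2f} "
        f"+{xsec_plus_1sigma(m_top):.2f} {xsec_minus_1sigma(m_top):.2f} pb"
    )
```

At $m_t$ = 175 GeV the rounded coefficients give a central value of about 6.5 pb, slightly below the quoted 6.6 pb, which is the level of agreement one can expect from coefficients printed to three significant figures.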



Source                                           σ+      σ− 

Muon trigger                                     0.05    0.07 
EM trigger                                       0.00    0.01 
Jet trigger                                      0.00    0.01 

Muon preselection                                0.16    0.14 
Electron preselection                            0.17    0.15 
Jet preselection                                 0.13    0.11 
Preselection efficiency (MC statistics)          0.06    0.04 

ε_QCD and ε_sig in μ+jets channel                0.04    0.03 
ε_QCD and ε_sig in e+jets channel                0.06    0.00 
Matrix method (data statistics)                  0.15    0.15 

Taggability in data                              0.03    0.00 
Flavor dependence of taggability                 0.00    0.03 
Semileptonic b tagging efficiency in data        0.33    0.24 
Semileptonic b tagging efficiency in MC          0.17    0.04 
Inclusive b tagging efficiency in MC             0.00    0.00 
Inclusive c tagging efficiency in MC             0.01    0.00 
Negative tagging efficiency in data              0.00    0.01 
SF_ll and SF_hf                                  0.01    0.00 

W fractions                                      0.29    0.27 
W fractions (MC statistics)                      0.03    0.03 

Total systematics (quad sum of the above)        0.57    0.47 

Total uncertainty (nuisance parameter lhood)     0.94    0.86 

TABLE XII: Systematic uncertainties in the combined l+jets 
channel. 
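
The row labeled "Total systematics" can be cross-checked by adding the individual entries in quadrature. The Python sketch below does this for both columns; the values are transcribed from Table XII, and the helper name is hypothetical.

```python
import math

# (sigma+, sigma-) per source, transcribed from Table XII.
SYSTEMATICS = {
    "Muon trigger": (0.05, 0.07),
    "EM trigger": (0.00, 0.01),
    "Jet trigger": (0.00, 0.01),
    "Muon preselection": (0.16, 0.14),
    "Electron preselection": (0.17, 0.15),
    "Jet preselection": (0.13, 0.11),
    "Preselection efficiency (MC statistics)": (0.06, 0.04),
    "eps_QCD and eps_sig in mu+jets channel": (0.04, 0.03),
    "eps_QCD and eps_sig in e+jets channel": (0.06, 0.00),
    "Matrix method (data statistics)": (0.15, 0.15),
    "Taggability in data": (0.03, 0.00),
    "Flavor dependence of taggability": (0.00, 0.03),
    "Semileptonic b tagging efficiency in data": (0.33, 0.24),
    "Semileptonic b tagging efficiency in MC": (0.17, 0.04),
    "Inclusive b tagging efficiency in MC": (0.00, 0.00),
    "Inclusive c tagging efficiency in MC": (0.01, 0.00),
    "Negative tagging efficiency in data": (0.00, 0.01),
    "SF_ll and SF_hf": (0.01, 0.00),
    "W fractions": (0.29, 0.27),
    "W fractions (MC statistics)": (0.03, 0.03),
}


def quadrature_sum(values):
    """Square root of the sum of squares."""
    return math.sqrt(sum(v * v for v in values))


total_plus = quadrature_sum(up for up, _down in SYSTEMATICS.values())
total_minus = quadrature_sum(down for _up, down in SYSTEMATICS.values())
print(f"total systematics: +{total_plus:.2f} / -{total_minus:.2f}")
```

Both sums agree with the tabulated totals of 0.57 and 0.47 to within rounding.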
FIG. 13: Top quark mass dependence of the measured cross 
section compared to the theoretical prediction [(|. 



X. CONCLUSIONS 

A measurement of the tt̄ production cross section in pp̄ 
collisions at a center of mass energy of 1.96 TeV is presented 
in events with a lepton, a neutrino, and ≥ 3 jets. After a 
preselection of the objects in the final state, a lifetime b 
tagging algorithm which explicitly reconstructs secondary 
vertices is applied, removing approximately 95% of the 
background while keeping 60% of the tt̄ signal. The measurement 
combines the μ+jets and the e+jets channels, using 422 pb⁻¹ and 
425 pb⁻¹ of data, respectively. The measured tt̄ production 
cross section for a top quark mass of 175 GeV is 

\[
\sigma_{p\bar{p}\to t\bar{t}+X} = 6.6 \pm 0.9\ ({\rm stat+syst}) \pm 0.4\ ({\rm lum})\ {\rm pb},
\]

in good agreement with SM expectations. The systematic 
uncertainty on the result (excluding luminosity) is 8%. This 
represents a factor of three reduction in the systematic 
uncertainty compared to previous publications by the DØ 
collaboration 0], making this result the most precise DØ 
measurement of the tt̄ production cross section to date. 
der von Humboldt Foundation; and the Marie Curie Pro- 



APPENDIX A: MONTE CARLO GENERATION 
OF W+JETS EVENTS 

The W+jets background is simulated using ALPGEN 1.3 [12] 
followed by PYTHIA 6.2 [23] to simulate the underlying event 
and the hadronization. The samples are generated separately for 
processes with 1, 2, 3, and 4 or more partons in the final 
state, as summarized in Table XIII. No parton-level cuts are 
applied on the heavy quarks (c or b) except for the c quark in 
the single c quark production process; the correct masses for 
the c and the b quark are also included. The processes Wcccc, 
Wbbcc, and Wbbbb are not included as their cross sections are 
negligible. W bosons are forced to decay to leptons; taus are 
subsequently forced to decay leptonically using TAUOLA. The 
respective fraction of W → τν events is adjusted in the overall 
sample to correctly reflect its contributions to the e+jets and 
μ+jets channels. 

Process  σ(pb)     Process  σ(pb)     Process  σ(pb)     Process  σ(pb) 

Wj       1600      Wjj      517       Wjjj     163       Wjjjj    49.5 
Wc       51.8      Wcj      28.6      Wcjj     19.4      Wcjjj    3.15 
                   Wbb      9.85      WbbJ     5.24      WbbJj    2.86 
                   Wcc      24.3      WccJ     12.5      WccJj    5.83 

TABLE XIII: W+jets processes in ALPGEN and 
their cross sections for the leptonic W boson decay, 
σ = σ(pp̄ → W+jets) · Br(W → lν), where j = u,d,s,g and J = u,d,s,g,c. 

The leading-order parton level calculations performed by ALPGEN 
need to be consistently combined with the partonic evolution 
given by the shower MC program PYTHIA to avoid the double 
counting of configurations leading to the same final state. An 
approximation of the MLM matching [30] (referred to as ad hoc 
matching) is used in the present analysis, where the matching 
is performed between matrix element partons and reconstructed 
jets. The W+jets MC samples are used in the analysis according 
to the number of heavy flavor (c or b) jets in the final state, 
classified as follows: W+light denotes events without c or b 
jets; Wc denotes events with one c jet due to single c 
production; W(cc) denotes events with one c jet due to double c 
production where two c quarks are merged in one jet or one of 
the c jets is outside of the acceptance region; Wcc denotes 
events with two c jets; W(bb) denotes events with one b jet due 
to double b production where two b quarks are merged in one jet 
or one of the b jets is outside of the acceptance region 
(single b production is highly suppressed and neglected); and 
Wbb denotes events with two b jets. Events are kept in the 
sample if the number of reconstructed jets equals the number of 
matrix element partons, where (cc) and (bb) are treated as one 
parton. As the fourth jet multiplicity bin is treated 
inclusively in the analysis, all events with ≥ 4 reconstructed 
jets are kept, independently of the number of additional 
non-matched light jets. 



APPENDIX B: HANDLING OF THE EVENT 
STATISTICS UNCERTAINTIES 

The matrix method (see Sec. VI) is used three times in this 
analysis: to determine the number of W-like and QCD multijet 
events in the preselected, the single, and the double tagged 
samples. The number of observed events used by the matrix 
method is subject to random fluctuations according to Poisson 
statistics and contributes to the total statistical uncertainty 
on the cross section measurement. To treat these uncertainties 
properly, each number of events entering the matrix method is 
considered as a free parameter constrained to its observed 
value. This appendix details the treatment of statistical 
uncertainties arising from the number of events observed in 
data in the extraction of the cross section. 

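For the preselected samples, the matrix method gives the number 
of W-like ($N_t^{\rm sig}$) and QCD multijet 
($N_t^{\rm QCD}$) events in the tight preselected sample as 

\[
N_t^{\rm sig} = \varepsilon_{\rm sig}\,
  \frac{N_t - \varepsilon_{\rm QCD} N_\ell}{\varepsilon_{\rm sig} - \varepsilon_{\rm QCD}},
\qquad
N_t^{\rm QCD} = \varepsilon_{\rm QCD}\,
  \frac{\varepsilon_{\rm sig} N_\ell - N_t}{\varepsilon_{\rm sig} - \varepsilon_{\rm QCD}}.
\]

The true values $N_\ell$ and $N_t$ are not known, and are left 
floating in the cross section calculation but constrained to 
their measured values $\tilde{N}_\ell$ and $\tilde{N}_t$ using 
Poisson statistics. 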

It is necessary to take into account that $N_\ell$ and $N_t$ 
are not independent variables. To do so, the matrix method 
equations are expressed in terms of $N_t$ and $N_{\ell-t}$, the 
latter representing the number of events that are loose but not 
tight. The equations become 

\[
N_t^{\rm sig} = \varepsilon_{\rm sig}\,
  \frac{N_t - \varepsilon_{\rm QCD}(N_t + N_{\ell-t})}{\varepsilon_{\rm sig} - \varepsilon_{\rm QCD}},
\qquad
N_t^{\rm QCD} = \varepsilon_{\rm QCD}\,
  \frac{\varepsilon_{\rm sig}(N_t + N_{\ell-t}) - N_t}{\varepsilon_{\rm sig} - \varepsilon_{\rm QCD}}.
\]

Here, $N_t$ and $N_{\ell-t}$ are constrained respectively to 
the observed number of tight events, $\tilde{N}_t$, and to the 
observed number of loose-but-not-tight events, 
$\tilde{N}_{\ell-t}$, by adding the following factor to the 
likelihood function 

\[
P(\tilde{N}_t; N_t) \times P(\tilde{N}_{\ell-t}; N_{\ell-t}),
\]

which represents the probability to observe $\tilde{N}_t$ and 
$\tilde{N}_{\ell-t}$ given their true values $N_t$ and 
$N_{\ell-t}$. 
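
The matrix method in terms of the tight and loose-but-not-tight counts can be written as a short Python function. This is a sketch under stated assumptions: the function name is hypothetical, and the numbers in the usage example are placeholders chosen for easy arithmetic, not values from the analysis.

```python
def matrix_method(n_tight, n_loose_not_tight, eps_sig, eps_qcd):
    """Return (N_t^sig, N_t^QCD), the W-like and QCD multijet yields in the
    tight sample, from the tight and loose-but-not-tight event counts.

    eps_sig and eps_qcd are the probabilities for a loose lepton from a
    W-like or a QCD multijet event, respectively, to also pass the tight
    selection; they must differ for the system to be solvable.
    """
    if eps_sig == eps_qcd:
        raise ValueError("eps_sig and eps_qcd must differ")
    n_loose = n_tight + n_loose_not_tight
    denominator = eps_sig - eps_qcd
    n_sig_tight = eps_sig * (n_tight - eps_qcd * n_loose) / denominator
    n_qcd_tight = eps_qcd * (eps_sig * n_loose - n_tight) / denominator
    return n_sig_tight, n_qcd_tight


# Placeholder inputs (hypothetical): 500 tight events, 500 loose-but-not-tight.
n_sig, n_qcd = matrix_method(n_tight=500, n_loose_not_tight=500,
                             eps_sig=0.8, eps_qcd=0.2)
print(f"W-like: {n_sig:.1f}, QCD multijet: {n_qcd:.1f}")
```

By construction the two yields add up to the number of tight events, which provides a quick consistency check on any implementation.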

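This procedure can be repeated for the single and double
tagged samples to predict the number of QCD multijet events as

$$N_{\mathrm{QCD}}^{1\mathrm{tag}} = \varepsilon_{\mathrm{QCD}}\,\frac{\varepsilon_{\mathrm{sig}}\left(N_t^{1\mathrm{tag}} + N_{\ell-t}^{1\mathrm{tag}}\right) - N_t^{1\mathrm{tag}}}{\varepsilon_{\mathrm{sig}} - \varepsilon_{\mathrm{QCD}}},$$

$$N_{\mathrm{QCD}}^{2\mathrm{tag}} = \varepsilon_{\mathrm{QCD}}\,\frac{\varepsilon_{\mathrm{sig}}\left(N_t^{2\mathrm{tag}} + N_{\ell-t}^{2\mathrm{tag}}\right) - N_t^{2\mathrm{tag}}}{\varepsilon_{\mathrm{sig}} - \varepsilon_{\mathrm{QCD}}}.$$

Note that the number of tight events with one tag, $\tilde{N}_t^{1\mathrm{tag}}$,
and the number of tight events with two tags, $\tilde{N}_t^{2\mathrm{tag}}$,
correspond to $N^{\mathrm{obs}}$ in Eq. 3. Therefore, $N_t^{1\mathrm{tag}}$ and
$N_t^{2\mathrm{tag}}$ are already constrained to their observed values,
and only one additional constraint for the number of events in the
loose$-$tight sample with one and two tags is needed:

$$\mathcal{P}(\tilde{N}_{\ell-t}^{1\mathrm{tag}}; N_{\ell-t}^{1\mathrm{tag}}) \times \mathcal{P}(\tilde{N}_{\ell-t}^{2\mathrm{tag}}; N_{\ell-t}^{2\mathrm{tag}}),$$

which represents the probability to observe $\tilde{N}_{\ell-t}^{1\mathrm{tag}}$ and
$\tilde{N}_{\ell-t}^{2\mathrm{tag}}$ given their true values $N_{\ell-t}^{1\mathrm{tag}}$ and $N_{\ell-t}^{2\mathrm{tag}}$.

Both the tight and the loose$-$tight sample can be
separated into events with zero, one, or two tags. Let
$N_t^{0\mathrm{tag}}$ and $N_{\ell-t}^{0\mathrm{tag}}$ represent the number of events with
zero tags in the tight and the loose$-$tight sample, respectively.
During the maximization process, $N_t^{0\mathrm{tag}}$ and
$N_{\ell-t}^{0\mathrm{tag}}$ are two free parameters that are constrained to
their observed values with Poisson probabilities

$$\mathcal{P}(\tilde{N}_t^{0\mathrm{tag}}; N_t^{0\mathrm{tag}}) \times \mathcal{P}(\tilde{N}_{\ell-t}^{0\mathrm{tag}}; N_{\ell-t}^{0\mathrm{tag}}).$$

In addition, the number of predicted tagged events can
be expressed in terms of the number of expected tagged
events originating from $t\bar{t}$, QCD multijet, $W$+jets, and
other small electroweak backgrounds, for one and two
tags, respectively:

$$N_t^{1\mathrm{tag}} = P_{t\bar{t}}^{1\mathrm{tag}} N_{t\bar{t}} + N_{\mathrm{QCD}}^{1\mathrm{tag}} + P_W^{1\mathrm{tag}} N_W + P_{\mathrm{MC\,bkg}}^{1\mathrm{tag}} N_{\mathrm{MC\,bkg}},$$

$$N_t^{2\mathrm{tag}} = P_{t\bar{t}}^{2\mathrm{tag}} N_{t\bar{t}} + N_{\mathrm{QCD}}^{2\mathrm{tag}} + P_W^{2\mathrm{tag}} N_W + P_{\mathrm{MC\,bkg}}^{2\mathrm{tag}} N_{\mathrm{MC\,bkg}}.$$

The contribution from the small electroweak backgrounds
(diboson, single top, and $Z \to \tau\tau$ production) is
labeled MC bkg to indicate that its normalization before
tagging is obtained from MC. $P_{\mathrm{process}}^{1\mathrm{tag}}$ and $P_{\mathrm{process}}^{2\mathrm{tag}}$ are the
average event tagging probabilities for a given process,
for single and double tags, respectively.

The number of $W$+jets events in the preselected sample
is given by

$$N_W = N_t^{\mathrm{sig}} - N_{t\bar{t}} - N_{\mathrm{MC\,bkg}},$$

where $N_t^{\mathrm{sig}}$ is obtained from the matrix method with

$$N_t = N_t^{0\mathrm{tag}} + N_t^{1\mathrm{tag}} + N_t^{2\mathrm{tag}}, \qquad N_{\ell-t} = N_{\ell-t}^{0\mathrm{tag}} + N_{\ell-t}^{1\mathrm{tag}} + N_{\ell-t}^{2\mathrm{tag}}.$$

Substituting this expression for $N_W$ into the equations
for $N_t^{1\mathrm{tag}}$ and $N_t^{2\mathrm{tag}}$ above allows us to express the
latter quantities as a function of the tagging probabilities;
signal and background efficiencies used in the matrix
method; MC prediction for the small electroweak processes;
and the floating parameters $N_t^{0\mathrm{tag}}$, $N_{\ell-t}^{0\mathrm{tag}}$, $N_{\ell-t}^{1\mathrm{tag}}$,
and $N_{\ell-t}^{2\mathrm{tag}}$. $N_t^{1\mathrm{tag}}$ and $N_t^{2\mathrm{tag}}$ are constrained to their
observed values using Poisson statistics

$$\mathcal{P}(\tilde{N}_t^{1\mathrm{tag}}; N_t^{1\mathrm{tag}}) \times \mathcal{P}(\tilde{N}_t^{2\mathrm{tag}}; N_t^{2\mathrm{tag}}).$$

The resulting likelihood is given by $\mathcal{L}_1$ below. The
index $i$ indicates the product over the channels $e$+3 jets,
$e$+4 jets, $\mu$+3 jets, and $\mu$+4 jets, respectively.

$$\mathcal{L}_1 = \prod_i \Big\{ \mathcal{P}(\tilde{N}_t^{1\mathrm{tag}}; N_t^{1\mathrm{tag}}) \times \mathcal{P}(\tilde{N}_t^{2\mathrm{tag}}; N_t^{2\mathrm{tag}}) \times \mathcal{P}(\tilde{N}_t^{0\mathrm{tag}}; N_t^{0\mathrm{tag}}) \times \mathcal{P}(\tilde{N}_{\ell-t}^{0\mathrm{tag}}; N_{\ell-t}^{0\mathrm{tag}}) \times \mathcal{P}(\tilde{N}_{\ell-t}^{1\mathrm{tag}}; N_{\ell-t}^{1\mathrm{tag}}) \times \mathcal{P}(\tilde{N}_{\ell-t}^{2\mathrm{tag}}; N_{\ell-t}^{2\mathrm{tag}}) \Big\}$$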
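To make the composition of the tagged-sample predictions entering this likelihood concrete, the following Python sketch assembles a predicted tagged yield from its four contributions ($t\bar{t}$, QCD multijet, $W$+jets, and the small electroweak backgrounds). It assumes that the $W$+jets yield is the $W$-like yield from the matrix method minus the $t\bar{t}$ and MC background yields; the function names, dictionary keys, and all numbers in the usage note are hypothetical placeholders rather than values from the analysis.

```python
def w_plus_jets_yield(n_sig_tight, n_ttbar, n_mc_bkg):
    """W+jets yield before tagging.

    Assumption: the W-like tight sample consists of W+jets, ttbar and the
    small MC-normalized electroweak backgrounds only.
    """
    return n_sig_tight - n_ttbar - n_mc_bkg


def predicted_tagged_yield(p_tag, n_ttbar, n_qcd_tagged, n_w, n_mc_bkg):
    """Predicted number of tagged tight events for one tag multiplicity.

    p_tag        -- average event tagging probabilities, a dict with the
                    (illustrative) keys 'ttbar', 'w' and 'mc_bkg'
    n_qcd_tagged -- QCD multijet yield in the tagged sample, taken from the
                    matrix method applied to that sample
    """
    return (p_tag["ttbar"] * n_ttbar
            + n_qcd_tagged
            + p_tag["w"] * n_w
            + p_tag["mc_bkg"] * n_mc_bkg)
```

With placeholder inputs of 800 $W$-like events, 100 $t\bar{t}$ events, and 50 MC background events, `w_plus_jets_yield` returns 650; combining this with assumed single-tag probabilities of 0.45, 0.04, and 0.06 and 5 tagged QCD multijet events gives a predicted single-tag yield of about 79 events. In the fit itself the $W$+jets yield depends on the tagged yields through the matrix method, so the relation is solved together with the floating parameters; the sketch ignores that coupling.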